Re: Create Release Candidate GitHub Workflow

2024-03-29 Thread Karl Wright
Svn url for review:

 https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.26

Our area in this svn:

 https://dist.apache.org/repos/dist/dev/manifoldcf

Our area for releases in this svn:

 https://dist.apache.org/repos/dist/release/manifoldcf

To move a release candidate from one to the other (i.e. to do the release):

svn move \
  https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.26 \
  https://dist.apache.org/repos/dist/release/manifoldcf/apache-manifoldcf-2.26
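A minimal sketch of the promotion step above, as a shell function. The function name promote_rc and the DRY_RUN switch are illustrative assumptions, not part of release.bat. A URL-to-URL svn move commits immediately, so the sketch passes a log message with -m; the message text is also an assumption.

```shell
#!/bin/sh
# Sketch only: promote a release candidate from dist/dev to dist/release.
# promote_rc and DRY_RUN are hypothetical names used for illustration.
DIST_BASE="https://dist.apache.org/repos/dist"

promote_rc() {
  version="$1"
  src="$DIST_BASE/dev/manifoldcf/apache-manifoldcf-$version"
  dst="$DIST_BASE/release/manifoldcf/apache-manifoldcf-$version"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: only print the command so it can be reviewed first.
    echo svn move -m "Release ManifoldCF $version" "$src" "$dst"
  else
    # A URL-to-URL move is committed right away, hence the log message.
    svn move -m "Release ManifoldCF $version" "$src" "$dst"
  fi
}
```

Calling promote_rc 2.26 prints the command; setting DRY_RUN=0 first would actually run it, which needs commit rights on the dist repository.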



On Fri, Mar 29, 2024 at 3:09 PM Karl Wright  wrote:

> The script as it exists now (release.bat) creates the release artifacts,
> signs them, and copies them into the svn development area.  To actually
> release, you then just need to move them (using svn move) to the release
> part of the area.
>
> The machine I used to do this on died but the svn URL for the dev area is
> the one I would send around for the review and signoff for the releases.
> Let me look it up.
>
>
>
>
> On Fri, Mar 29, 2024 at 11:44 AM Piergiorgio Lucidi <
> piergior...@apache.org> wrote:
>
>> The open points now are related to the last two steps of our workflow:
>>
>>- Generating the file hashes using a shared GPG secret (in progress...)
>>- Updating SVN public folders for publishing releases (TODO)
>>
>> We should agree with the Automated Release Process before proceeding:
>>
>> https://issues.apache.org/jira/browse/INFRA-25665?focusedCommentId=17832209=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17832209
>>
>> Practically INFRA will generate a new GPG key and they will add the public
>> key into the ManifoldCF KEYS file.
>> This will let us manage the generation of file hashes using GitHub
>> Actions.
>>
>> Do you all agree with this?
>> Please let me know.
>> Thanks.
>>
>> Cheers,
>> PG
>>
>> On Tue, Mar 26, 2024 at 5:19 PM Karl Wright wrote:
>>
>> > Well we obviously need something that works, and just updating the
>> script
>> > to use github commands is one way to do that and would generate releases
>> > like we do now.
>> >
>> >
>> >
>> > On Tue, Mar 12, 2024 at 9:00 AM Piergiorgio Lucidi <
>> piergior...@apache.org
>> > >
>> > wrote:
>> >
>> > > Hi Karl,
>> > >
>> > > I tried to look at the current process but it's not clear to me what
>> > > I should do now.
>> > > Should I just use svn commands from GitHub in order to execute the
>> same
>> > > steps?
>> > > Or do we have an alternative way without using svn?
>> > >
>> > > Do you know if we have something GitHub-centric for managing releases?
>> > >
>> > > Cheers,
>> > > PG
>> > >
>> > > On Tue, Mar 5, 2024 at 9:53 PM Karl Wright <daddy...@gmail.com> wrote:
>> > >
>> > > > Very good!
>> > > >
>> > > > In the past we've often had to add new commits to the release branch
>> > and
>> > > > create a new RC.  The RCs have to be copied into the staging area
>> (in
>> > an
>> > > > svn repo) and then when actually released there's a simple svn
>> command
>> > to
>> > > > do that.  Are you familiar with that process?  For this reason it
>> may
>> > be
>> > > > better to separate the creation of the release branch from
>> everything
>> > > else.
>> > > >
>> > > > Karl
>> > > >
>> > > >
>> > > > On Tue, Mar 5, 2024 at 9:23 AM Piergiorgio Lucidi <
>> > > piergior...@apache.org>
>> > > > wrote:
>> > > >
>> > > > > Hi folks,
>> > > > >
>> > > > > I have just pushed a potential GitHub workflow for creating the
>> > release
>> > > > > candidate branch and artifacts [1]. The related issue is
>> available in
>> > > > JIRA
>> > > > > [2].
>> > > > >
>> > > > > We need to test it but I think that it could be something close to
>> > what
>> > > > we
>> > > > > need:
>> > > > >
>> > > > > 1. Create the new branch
>> > > > > 2. Update CHANGES.txt, build.xml and all the poms
>> > > > > 3. Run the Ant build
>> > > > > 4. Run the Maven build (if we want to push artifacts on public
>> repos)
>> > > > > 5. Check licenses using Apache RAT
>> > > > > 6. Commit and push the new branch
>> > > > > 7. Upload artifacts as GitHub release assets
>> > > > >
>> > > > > Any feedback?
>> > > > > Thanks everyone.
>> > > > >
>> > > > > Cheers,
>> > > > > PG
>> > > > >
>> > > > > [1] -
>> > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://github.com/apache/manifoldcf/blob/CONNECTORS-1754/.github/workflows/create-release-candidate.yml
>> > > > >
>> > > > > [2] - https://issues.apache.org/jira/browse/CONNECTORS-1754
>> > > > > --
>> > > > > Piergiorgio
>> > > > >
>> > > >
>> > >
>> > >
>> > > --
>> > > Piergiorgio
>> > >
>> >
>>
>>
>> --
>> Piergiorgio
>>
>


Re: Create Release Candidate GitHub Workflow

2024-03-29 Thread Karl Wright
The script as it exists now (release.bat) creates the release artifacts,
signs them, and copies them into the svn development area.  To actually
release, you then just need to move them (using svn move) to the release
part of the area.

The machine I used to do this on has died, but the svn URL for the dev area
is the one I would send around for review and signoff of the releases.
Let me look it up.
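A rough sketch of the "sign and hash" part of that flow, for anyone re-creating it outside release.bat. The helper names are hypothetical, sha512sum is assumed to be GNU coreutils, and the actual commands inside release.bat are not shown in this thread, so treat this as an illustration rather than a copy of the script.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not taken from release.bat.

# Write <artifact>.sha512 next to the artifact.
make_checksum() {
  artifact="$1"
  dir=$(dirname "$artifact")
  name=$(basename "$artifact")
  # Run inside the artifact's directory so the checksum file records a
  # bare file name and can be verified from any download location.
  (cd "$dir" && sha512sum "$name" > "$name.sha512")
}

# Write <artifact>.asc, a detached ASCII-armored signature.
# Requires a GPG secret key to be available to the caller.
sign_artifact() {
  gpg --armor --detach-sign "$1"
}
```

A reviewer can then check a downloaded artifact with sha512sum -c <file>.sha512 and gpg --verify <file>.asc <file>.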




On Fri, Mar 29, 2024 at 11:44 AM Piergiorgio Lucidi 
wrote:

> The open points now are related to the last two steps of our workflow:
>
>- Generating the file hashes using a shared GPG secret (in progress...)
>- Updating SVN public folders for publishing releases (TODO)
>
> We should agree with the Automated Release Process before proceeding:
>
> https://issues.apache.org/jira/browse/INFRA-25665?focusedCommentId=17832209=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17832209
>
> Practically INFRA will generate a new GPG key and they will add the public
> key into the ManifoldCF KEYS file.
> This will let us manage the generation of file hashes using GitHub
> Actions.
>
> Do you all agree with this?
> Please let me know.
> Thanks.
>
> Cheers,
> PG
>
> On Tue, Mar 26, 2024 at 5:19 PM Karl Wright wrote:
>
> > Well we obviously need something that works, and just updating the script
> > to use github commands is one way to do that and would generate releases
> > like we do now.
> >
> >
> >
> > On Tue, Mar 12, 2024 at 9:00 AM Piergiorgio Lucidi <
> piergior...@apache.org
> > >
> > wrote:
> >
> > > Hi Karl,
> > >
> > > I tried to look at the current process but it's not clear to me what
> > > I should do now.
> > > Should I just use svn commands from GitHub in order to execute the same
> > > steps?
> > > Or do we have an alternative way without using svn?
> > >
> > > Do you know if we have something GitHub-centric for managing releases?
> > >
> > > Cheers,
> > > PG
> > >
> > > On Tue, Mar 5, 2024 at 9:53 PM Karl Wright <daddy...@gmail.com> wrote:
> > >
> > > > Very good!
> > > >
> > > > In the past we've often had to add new commits to the release branch
> > and
> > > > create a new RC.  The RCs have to be copied into the staging area (in
> > an
> > > > svn repo) and then when actually released there's a simple svn
> command
> > to
> > > > do that.  Are you familiar with that process?  For this reason it may
> > be
> > > > better to separate the creation of the release branch from everything
> > > else.
> > > >
> > > > Karl
> > > >
> > > >
> > > > On Tue, Mar 5, 2024 at 9:23 AM Piergiorgio Lucidi <
> > > piergior...@apache.org>
> > > > wrote:
> > > >
> > > > > Hi folks,
> > > > >
> > > > > I have just pushed a potential GitHub workflow for creating the
> > release
> > > > > candidate branch and artifacts [1]. The related issue is available
> in
> > > > JIRA
> > > > > [2].
> > > > >
> > > > > We need to test it but I think that it could be something close to
> > what
> > > > we
> > > > > need:
> > > > >
> > > > > 1. Create the new branch
> > > > > 2. Update CHANGES.txt, build.xml and all the poms
> > > > > 3. Run the Ant build
> > > > > 4. Run the Maven build (if we want to push artifacts on public
> repos)
> > > > > 5. Check licenses using Apache RAT
> > > > > 6. Commit and push the new branch
> > > > > 7. Upload artifacts as GitHub release assets
> > > > >
> > > > > Any feedback?
> > > > > Thanks everyone.
> > > > >
> > > > > Cheers,
> > > > > PG
> > > > >
> > > > > [1] -
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/manifoldcf/blob/CONNECTORS-1754/.github/workflows/create-release-candidate.yml
> > > > >
> > > > > [2] - https://issues.apache.org/jira/browse/CONNECTORS-1754
> > > > > --
> > > > > Piergiorgio
> > > > >
> > > >
> > >
> > >
> > > --
> > > Piergiorgio
> > >
> >
>
>
> --
> Piergiorgio
>


Re: Create Release Candidate GitHub Workflow

2024-03-26 Thread Karl Wright
Well, we obviously need something that works, and just updating the script
to use GitHub commands is one way to do that; it would generate releases
the same way we do now.



On Tue, Mar 12, 2024 at 9:00 AM Piergiorgio Lucidi 
wrote:

> Hi Karl,
>
> I tried to look at the current process but it's not clear to me what
> I should do now.
> Should I just use svn commands from GitHub in order to execute the same
> steps?
> Or do we have an alternative way without using svn?
>
> Do you know if we have something GitHub-centric for managing releases?
>
> Cheers,
> PG
>
> On Tue, Mar 5, 2024 at 9:53 PM Karl Wright wrote:
>
> > Very good!
> >
> > In the past we've often had to add new commits to the release branch and
> > create a new RC.  The RCs have to be copied into the staging area (in an
> > svn repo) and then when actually released there's a simple svn command to
> > do that.  Are you familiar with that process?  For this reason it may be
> > better to separate the creation of the release branch from everything
> else.
> >
> > Karl
> >
> >
> > On Tue, Mar 5, 2024 at 9:23 AM Piergiorgio Lucidi <
> piergior...@apache.org>
> > wrote:
> >
> > > Hi folks,
> > >
> > > I have just pushed a potential GitHub workflow for creating the release
> > > candidate branch and artifacts [1]. The related issue is available in
> > JIRA
> > > [2].
> > >
> > > We need to test it but I think that it could be something close to what
> > we
> > > need:
> > >
> > > 1. Create the new branch
> > > 2. Update CHANGES.txt, build.xml and all the poms
> > > 3. Run the Ant build
> > > 4. Run the Maven build (if we want to push artifacts on public repos)
> > > 5. Check licenses using Apache RAT
> > > 6. Commit and push the new branch
> > > 7. Upload artifacts as GitHub release assets
> > >
> > > Any feedback?
> > > Thanks everyone.
> > >
> > > Cheers,
> > > PG
> > >
> > > [1] -
> > >
> > >
> >
> https://github.com/apache/manifoldcf/blob/CONNECTORS-1754/.github/workflows/create-release-candidate.yml
> > >
> > > [2] - https://issues.apache.org/jira/browse/CONNECTORS-1754
> > > --
> > > Piergiorgio
> > >
> >
>
>
> --
> Piergiorgio
>


Re: Create Release Candidate GitHub Workflow

2024-03-05 Thread Karl Wright
Very good!

In the past we've often had to add new commits to the release branch and
create a new RC.  Each RC has to be copied into the staging area (in an
svn repo), and when it is actually released there's a simple svn command to
do that.  Are you familiar with that process?  For this reason it may be
better to separate the creation of the release branch from everything else.

Karl


On Tue, Mar 5, 2024 at 9:23 AM Piergiorgio Lucidi 
wrote:

> Hi folks,
>
> I have just pushed a potential GitHub workflow for creating the release
> candidate branch and artifacts [1]. The related issue is available in JIRA
> [2].
>
> We need to test it but I think that it could be something close to what we
> need:
>
> 1. Create the new branch
> 2. Update CHANGES.txt, build.xml and all the poms
> 3. Run the Ant build
> 4. Run the Maven build (if we want to push artifacts on public repos)
> 5. Check licenses using Apache RAT
> 6. Commit and push the new branch
> 7. Upload artifacts as GitHub release assets
>
> Any feedback?
> Thanks everyone.
>
> Cheers,
> PG
>
> [1] -
>
> https://github.com/apache/manifoldcf/blob/CONNECTORS-1754/.github/workflows/create-release-candidate.yml
>
> [2] - https://issues.apache.org/jira/browse/CONNECTORS-1754
> --
> Piergiorgio
>


[jira] [Commented] (CONNECTORS-1495) Brand new website

2023-12-19 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17798551#comment-17798551
 ] 

Karl Wright commented on CONNECTORS-1495:
-

Nice work!!
That was fast, and I agree with putting all the materials in the repo for it.  
Have you thought about licensing of those materials?  This would be the time...

> Brand new website
> -
>
>                 Key: CONNECTORS-1495
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1495
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Site
>    Affects Versions: ManifoldCF 2.9.1
>            Reporter: Piergiorgio Lucidi
>            Assignee: Piergiorgio Lucidi
>            Priority: Major
>             Fix For: ManifoldCF next
>
>         Attachments: ManifoldCF-FluidoSkin.png, PDF-Rendition-1.png,
> PDF-Rendition-2.png, Website - status - 20180510-2.png, Website - status -
> 20180510.png
>
>   Original Estimate: 480h
>  Remaining Estimate: 480h
>
> The community decided to work on a brand new website:
> [http://mail-archives.apache.org/mod_mbox/manifoldcf-dev/201712.mbox/%3CCAHVHQx8odjgXMw%3DnhmSeDt0pYOUd0j%2BtkmMNtFnCJvHFcZwyEg%40mail.gmail.com%3E]
> The proposed technology is Jekyll but we have also to decide the website 
> template to use.
> [~kamaci] suggested the [Apache CloudStack|https://cloudstack.apache.org/] 
> template.
> [~molgun] proposed this approach:
>  # Find a modern new static site generator like Jekyll [1]
>  # Create a template
>  # Start to use it in a specific path like 
> [https://manifoldcf.apache.org/*new*]
>  # Migrate our Forrest xml's to Markdown (we can automate this somehow)
>  # Start to serve our new site on root path
> [1] [https://jekyllrb.com/docs/home/]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CONNECTORS-1495) Brand new website

2023-12-14 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796720#comment-17796720
 ] 

Karl Wright commented on CONNECTORS-1495:
-

Hi [~piergiorgioluc...@gmail.com], I see no actual HTML pages for your new 
site.  What am I missing?

To publish this, all that is needed is to:
(1) svn remove all directories and files under the svn pub/sub URL
(2) Copy the new site in
(3) Use svn add * to add all the directories and files
(4) svn commit publishes it.

The svn URL is: https://svn.apache.org/repos/asf/manifoldcf/site/publish
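The four steps above could be sketched as a small helper that only prints the commands for review. The function name publish_plan, the working-copy layout, the commit message, and the use of svn add --force (instead of svn add *) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: prints the publish steps instead of running them.
# publish_plan and its arguments are hypothetical names.
SITE_URL="https://svn.apache.org/repos/asf/manifoldcf/site/publish"

publish_plan() {
  new_site="$1"   # directory holding the freshly generated site
  wc="$2"         # local svn working copy of the publish area
  cat <<EOF
svn checkout $SITE_URL $wc
svn delete $wc/*
cp -R $new_site/. $wc/
svn add --force $wc
svn commit -m "Publish new site" $wc
EOF
}
```

For example, publish_plan ./site-build ./site-wc lists the five commands in order; nothing is committed until someone runs them by hand.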







Re: JAXBContext - JDK 11

2023-12-05 Thread Karl Wright
It looks like jaxb is already included in connector-common.

These are the versions downloaded:





Karl


On Tue, Dec 5, 2023 at 10:18 AM Karl Wright  wrote:

> And Uwe: you also would need to add it to connector-build.xml.
> Karl
>
>
> On Tue, Dec 5, 2023 at 10:03 AM Karl Wright  wrote:
>
>> Have a look at the list of dependencies in framework/build.xml for
>> various contexts.  I believe jaxb-api is there already but possibly not
>> jaxb-impl.  The fix would then be to add jaxb-impl to every place we see
>> jaxb-api in that file.  Also, the download for jaxb is in the main
>> build.xml and it would have to include that jar as well.
>>
>> I'll look into this when I get done work.
>>
>> Karl
>>
>>
>> On Tue, Dec 5, 2023 at 6:11 AM Piergiorgio Lucidi 
>> wrote:
>>
>>> Hi Uwe,
>>>
>>> Thank you for sharing this issue and I think that the jaxb library is
>>> needed to solve this problem.
>>> I haven't tested it yet, but adding these dependencies should probably
>>> solve it:
>>>
>>>- jaxb-core:2.3.1
>>>- jaxb-api:2.3.1
>>>- jaxb-impl:2.3.1
>>>
>>> I'm going to replicate the issue and raise a ticket on Jira about this.
>>> Thank you again and hope this helps ;)
>>>
>>> Cheers,
>>> PG
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Dec 5, 2023 at 11:55 AM Wolfinger Uwe <uwe.wolfin...@oegk.at> wrote:
>>>
>>> > When trying to use the Generic Authority Connector
>>> >
>>> (org.apache.manifoldcf.authorities.authorities.generic.GenericAuthority) we
>>> > experienced some problems:
>>> >
>>> > Line 681 (context = JAXBContext.newInstance(Auth.class);) results in an
>>> > error:
>>> >
>>> > javax.xml.bind.JAXBException: Implementation of JAXB-API has not been
>>> > found on module path or classpath.
>>> > - with linked exception:
>>> > [java.lang.ClassNotFoundException:
>>> > com.sun.xml.internal.bind.v2.ContextFactory]
>>> > javax.xml.bind.JAXBException: Implementation of JAXB-API has not been
>>> > found on module path or classpath.
>>> > - with linked exception:
>>> > [java.lang.ClassNotFoundException:
>>> > com.sun.xml.internal.bind.v2.ContextFactory]
>>> > at
>>> javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:232)
>>> > at javax.xml.bind.ContextFinder.find(ContextFinder.java:375)
>>> > at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:691)
>>> > at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:632)
>>> > at
>>> >
>>> org.apache.manifoldcf.authorities.authorities.generic.GenericAuthority$FetchTokensThread.run(GenericAuthority.java:717)
>>> > Caused by: java.lang.ClassNotFoundException:
>>> > com.sun.xml.internal.bind.v2.ContextFactory
>>> > at
>>> >
>>> org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1412)
>>> > at
>>> >
>>> org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1220)
>>> > at
>>> >
>>> javax.xml.bind.ServiceLoaderUtil.nullSafeLoadClass(ServiceLoaderUtil.java:92)
>>> > at
>>> >
>>> javax.xml.bind.ServiceLoaderUtil.safeLoadClass(ServiceLoaderUtil.java:125)
>>> > at
>>> javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:230)
>>> > ... 4 more
>>> >
>>> > We are using JDK 11 as our runtime environment. As far as I understand,
>>> > support for JAXB was removed in JDK 11. So my question is: does anybody
>>> > know a workaround for this problem (adding the JAXB libs manually?), or
>>> > are there any plans to upgrade the GenericAuthority Connector to JDK 11
>>> > (jakarta.xml.bind)?
>>> >
>>> > Kind Regards,
>>> > Uwe
>>> >
>>>
>>>
>>> --
>>> Piergiorgio
>>>
>>


Re: JAXBContext - JDK 11

2023-12-05 Thread Karl Wright
And Uwe: you also would need to add it to connector-build.xml.
Karl


On Tue, Dec 5, 2023 at 10:03 AM Karl Wright  wrote:

> Have a look at the list of dependencies in framework/build.xml for various
> contexts.  I believe jaxb-api is there already but possibly not jaxb-impl.
> The fix would then be to add jaxb-impl to every place we see jaxb-api in
> that file.  Also, the download for jaxb is in the main build.xml and it
> would have to include that jar as well.
>
> I'll look into this when I get done work.
>
> Karl
>
>
> On Tue, Dec 5, 2023 at 6:11 AM Piergiorgio Lucidi 
> wrote:
>
>> Hi Uwe,
>>
>> Thank you for sharing this issue and I think that the jaxb library is
>> needed to solve this problem.
>> I haven't tested it yet, but adding these dependencies should probably
>> solve it:
>>
>>- jaxb-core:2.3.1
>>- jaxb-api:2.3.1
>>- jaxb-impl:2.3.1
>>
>> I'm going to replicate the issue and raise a ticket on Jira about this.
>> Thank you again and hope this helps ;)
>>
>> Cheers,
>> PG
>>
>>
>>
>>
>>
>> On Tue, Dec 5, 2023 at 11:55 AM Wolfinger Uwe <uwe.wolfin...@oegk.at> wrote:
>>
>> > When trying to use the Generic Authority Connector
>> >
>> (org.apache.manifoldcf.authorities.authorities.generic.GenericAuthority) we
>> > experienced some problems:
>> >
>> > Line 681 (context = JAXBContext.newInstance(Auth.class);) results in an
>> > error:
>> >
>> > javax.xml.bind.JAXBException: Implementation of JAXB-API has not been
>> > found on module path or classpath.
>> > - with linked exception:
>> > [java.lang.ClassNotFoundException:
>> > com.sun.xml.internal.bind.v2.ContextFactory]
>> > javax.xml.bind.JAXBException: Implementation of JAXB-API has not been
>> > found on module path or classpath.
>> > - with linked exception:
>> > [java.lang.ClassNotFoundException:
>> > com.sun.xml.internal.bind.v2.ContextFactory]
>> > at
>> javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:232)
>> > at javax.xml.bind.ContextFinder.find(ContextFinder.java:375)
>> > at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:691)
>> > at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:632)
>> > at
>> >
>> org.apache.manifoldcf.authorities.authorities.generic.GenericAuthority$FetchTokensThread.run(GenericAuthority.java:717)
>> > Caused by: java.lang.ClassNotFoundException:
>> > com.sun.xml.internal.bind.v2.ContextFactory
>> > at
>> >
>> org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1412)
>> > at
>> >
>> org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1220)
>> > at
>> >
>> javax.xml.bind.ServiceLoaderUtil.nullSafeLoadClass(ServiceLoaderUtil.java:92)
>> > at
>> >
>> javax.xml.bind.ServiceLoaderUtil.safeLoadClass(ServiceLoaderUtil.java:125)
>> > at
>> javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:230)
>> > ... 4 more
>> >
>> > We are using JDK 11 as our runtime environment. As far as I understand,
>> > support for JAXB was removed in JDK 11. So my question is: does anybody
>> > know a workaround for this problem (adding the JAXB libs manually?), or
>> > are there any plans to upgrade the GenericAuthority Connector to JDK 11
>> > (jakarta.xml.bind)?
>> >
>> > Kind Regards,
>> > Uwe
>> >
>>
>>
>> --
>> Piergiorgio
>>
>


Re: JAXBContext - JDK 11

2023-12-05 Thread Karl Wright
Have a look at the list of dependencies in framework/build.xml for various
contexts.  I believe jaxb-api is there already but possibly not jaxb-impl.
The fix would then be to add jaxb-impl to every place we see jaxb-api in
that file.  Also, the download for jaxb is in the main build.xml and it
would have to include that jar as well.
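As a starting point for that change, a small sketch that lists where jaxb-api is referenced, so jaxb-impl can be added alongside each hit. The function name is hypothetical, and the exact form of the entries in framework/build.xml is not shown in this thread, so the search string is only an assumption about what they contain.

```shell
#!/bin/sh
# Sketch only: list_jaxb_api_refs is a hypothetical helper name.
# Prints "line-number:line" for every line mentioning jaxb-api.
list_jaxb_api_refs() {
  grep -n "jaxb-api" "$1"
}
```

Running it as list_jaxb_api_refs framework/build.xml (and again on connector-build.xml and the main build.xml) would give the list of places to review.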

I'll look into this when I get done work.

Karl


On Tue, Dec 5, 2023 at 6:11 AM Piergiorgio Lucidi 
wrote:

> Hi Uwe,
>
> Thank you for sharing this issue and I think that the jaxb library is
> needed to solve this problem.
> I haven't tested it yet, but adding these dependencies should probably
> solve it:
>
>- jaxb-core:2.3.1
>- jaxb-api:2.3.1
>- jaxb-impl:2.3.1
>
> I'm going to replicate the issue and raise a ticket on Jira about this.
> Thank you again and hope this helps ;)
>
> Cheers,
> PG
>
>
>
>
>
> On Tue, Dec 5, 2023 at 11:55 AM Wolfinger Uwe <uwe.wolfin...@oegk.at> wrote:
>
> > When trying to use the Generic Authority Connector
> > (org.apache.manifoldcf.authorities.authorities.generic.GenericAuthority)
> we
> > experienced some problems:
> >
> > Line 681 (context = JAXBContext.newInstance(Auth.class);) results in an
> > error:
> >
> > javax.xml.bind.JAXBException: Implementation of JAXB-API has not been
> > found on module path or classpath.
> > - with linked exception:
> > [java.lang.ClassNotFoundException:
> > com.sun.xml.internal.bind.v2.ContextFactory]
> > javax.xml.bind.JAXBException: Implementation of JAXB-API has not been
> > found on module path or classpath.
> > - with linked exception:
> > [java.lang.ClassNotFoundException:
> > com.sun.xml.internal.bind.v2.ContextFactory]
> > at
> javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:232)
> > at javax.xml.bind.ContextFinder.find(ContextFinder.java:375)
> > at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:691)
> > at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:632)
> > at
> >
> org.apache.manifoldcf.authorities.authorities.generic.GenericAuthority$FetchTokensThread.run(GenericAuthority.java:717)
> > Caused by: java.lang.ClassNotFoundException:
> > com.sun.xml.internal.bind.v2.ContextFactory
> > at
> >
> org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1412)
> > at
> >
> org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1220)
> > at
> >
> javax.xml.bind.ServiceLoaderUtil.nullSafeLoadClass(ServiceLoaderUtil.java:92)
> > at
> >
> javax.xml.bind.ServiceLoaderUtil.safeLoadClass(ServiceLoaderUtil.java:125)
> > at
> javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:230)
> > ... 4 more
> >
> > We are using JDK 11 as our runtime environment. As far as I understand,
> > support for JAXB was removed in JDK 11. So my question is: does anybody
> > know a workaround for this problem (adding the JAXB libs manually?), or
> > are there any plans to upgrade the GenericAuthority Connector to JDK 11
> > (jakarta.xml.bind)?
> >
> > Kind Regards,
> > Uwe
> >
>
>
> --
> Piergiorgio
>


Re: HDFS Connector - Maven build issue found thanks to the ManifoldCF SDK

2023-11-25 Thread Karl Wright
Right, exclusions should work...
Karl


On Fri, Nov 24, 2023 at 5:26 PM Piergiorgio Lucidi 
wrote:

> Hi Karl,
>
> the problem is that the only dependency declared in that pom.xml is
> hadoop-common, which pulls in hadoop-annotations, and that is what causes
> the issue. Below is what I see in the hadoop-annotations pom:
>
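> <profiles>
>   <profile>
>     <id>os.linux</id>
>     <activation>
>       <os>
>         <family>!Mac</family>
>       </os>
>     </activation>
>     <dependencies>
>       <dependency>
>         <groupId>jdk.tools</groupId>
>         <artifactId>jdk.tools</artifactId>
>         <version>1.6</version>
>         <scope>system</scope>
>         <systemPath>${java.home}/../lib/tools.jar</systemPath>
>       </dependency>
>     </dependencies>
>   </profile>
>   <profile>
>     <id>jdk1.7</id>
>     <activation>
>       <jdk>1.7</jdk>
>     </activation>
>     <dependencies>
>       <dependency>
>         <groupId>jdk.tools</groupId>
>         <artifactId>jdk.tools</artifactId>
>         <version>1.7</version>
>         <scope>system</scope>
>         <systemPath>${java.home}/../lib/tools.jar</systemPath>
>       </dependency>
>     </dependencies>
>   </profile>
> </profiles>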
>
> We could probably solve this by declaring the hadoop-annotations
> dependency with an exclusion for the jdk.tools dependency.
> Something like this:
>
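> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-annotations</artifactId>
>   <version>${hadoop.version}</version>
>   <exclusions>
>     <exclusion>
>       <groupId>jdk.tools</groupId>
>       <artifactId>jdk.tools</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>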
>
> I'll try to apply this using the ManifoldCF SDK and I'll let you know.
>
> Cheers,
> PG
>
>
> On Fri, Nov 24, 2023 at 9:34 PM Karl Wright wrote:
>
> > Hi - the jar it's looking for may no longer be part of the java 11 jdk.
> > I'm not exactly sure how to best handle this in Maven.  It may simply be
> > possible to remove the dependency entirely from the maven pom.
> >
> >
> > On Fri, Nov 24, 2023 at 12:34 PM Piergiorgio Lucidi <
> > piergior...@apache.org>
> > wrote:
> >
> > > I have just created this ticket:
> > > https://issues.apache.org/jira/browse/CONNECTORS-1751
> > >
> > > On Fri, Nov 24, 2023 at 6:19 PM Piergiorgio Lucidi <piergior...@apache.org> wrote:
> > >
> > > > Hi folks,
> > > >
> > > > Thanks to the ManifoldCF SDK, I found another issue with the Maven build.
> > > > Steps to reproduce the issue with the ManifoldCF SDK:
> > > >
> > > > 1. Clone the ManifoldCF SDK project from the following URL:
> > > > git clone https://github.com/OpenPj/manifoldcf-sdk.git
> > > >
> > > > 2. Remove line 27 from the run.sh script; this line applies the hotfix
> > > > that upgrades the Hadoop version to the latest version, 3.3.6.
> > > >
> > > > 3. Run the following command to download the ManifoldCF source code and
> > > > run the Ant and Maven build process in a Docker container based on
> > > > Maven 3.9.5 and OpenJDK Temurin 11. This installs locally all the Maven
> > > > dependencies needed to implement custom extensions / connectors. The
> > > > Docker volume includes the entire Maven repo used by the ManifoldCF build
> > > > process; it will be copied into the SDK target folder and configured so
> > > > that you can compile your custom Java code:
> > > >
> > > > ./run.sh init 2.26 ga
> > > >
> > > > The current version of the SDK includes the sep instruction to fix all
> > > > the build problems described here:
> > > > https://issues.apache.org/jira/browse/CONNECTORS-1750
> > > >
> > > > These issues are also resolved in the ManifoldCF main trunk but are still
> > > > present in the latest release packages (source code packages).
> > > > I have to confess that I don't know whether this issue also affects
> > > > other releases.
> > > > The SDK returns the following error, which should be related to the
> > > > Hadoop Annotations dependencies of Hadoop 2.6.0, a very old version of
> > > > Hadoop that includes a JDK 1.6 dependency: jdk.tools:jdk.tools:jar:1.6.
> > > >
> > > > [ERROR] Failed to execute goal on project mcf-hdfs-connector: Could
> not
> > > > resolve dependencies for project
> > > > org.apache.manifoldcf:mcf-hdfs-connector:jar:2.26: The following
> > > artifacts
> > > > could not be resolved: jdk.tools:jdk.tools:jar:1.6: Could not find
> > > artifact
> > > > jdk.tools:jdk.tools:jar:1.6 at specified path
> > > > /opt/java/openjdk/../lib/tools.jar -> [Help 1]
> > > 

Re: HDFS Connector - Maven build issue found thanks to the ManifoldCF SDK

2023-11-24 Thread Karl Wright
Hi - the jar it's looking for may no longer be part of the Java 11 JDK.
I'm not exactly sure how best to handle this in Maven.  It may simply be
possible to remove the dependency entirely from the Maven pom.


On Fri, Nov 24, 2023 at 12:34 PM Piergiorgio Lucidi 
wrote:

> I have just created this ticket:
> https://issues.apache.org/jira/browse/CONNECTORS-1751
>
> On Fri, Nov 24, 2023 at 6:19 PM Piergiorgio Lucidi <piergior...@apache.org> wrote:
>
> > Hi folks,
> >
> > Thanks to the ManifoldCF SDK, I found another issue with the Maven build.
> > Steps to reproduce the issue with the ManifoldCF SDK:
> >
> > 1. Clone the ManifoldCF SDK project from the following URL:
> > git clone https://github.com/OpenPj/manifoldcf-sdk.git
> >
> > 2. Remove line 27 from the run.sh script; this line applies the hotfix
> > that upgrades the Hadoop version to the latest version, 3.3.6.
> >
> > 3. Run the following command to download the ManifoldCF source code and
> > run the Ant and Maven build process in a Docker container based on
> > Maven 3.9.5 and OpenJDK Temurin 11. This installs locally all the Maven
> > dependencies needed to implement custom extensions / connectors. The
> > Docker volume includes the entire Maven repo used by the ManifoldCF build
> > process; it will be copied into the SDK target folder and configured so
> > that you can compile your custom Java code:
> >
> > ./run.sh init 2.26 ga
> >
> > The current version of the SDK includes the sep instruction to fix all
> > the build problems described here:
> > https://issues.apache.org/jira/browse/CONNECTORS-1750
> >
> > These issues are also resolved in the ManifoldCF main trunk but are still
> > present in the latest release packages (source code packages).
> > I have to confess that I don't know whether this issue also affects
> > other releases.
> > The SDK returns the following error, which should be related to the
> > Hadoop Annotations dependencies of Hadoop 2.6.0, a very old version of
> > Hadoop that includes a JDK 1.6 dependency: jdk.tools:jdk.tools:jar:1.6.
> >
> > [ERROR] Failed to execute goal on project mcf-hdfs-connector: Could not
> > resolve dependencies for project
> > org.apache.manifoldcf:mcf-hdfs-connector:jar:2.26: The following artifacts
> > could not be resolved: jdk.tools:jdk.tools:jar:1.6: Could not find artifact
> > jdk.tools:jdk.tools:jar:1.6 at specified path
> > /opt/java/openjdk/../lib/tools.jar -> [Help 1]
> > org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute
> > goal on project mcf-hdfs-connector: Could not resolve dependencies for
> > project org.apache.manifoldcf:mcf-hdfs-connector:jar:2.26: The following
> > artifacts could not be resolved: jdk.tools:jdk.tools:jar:1.6: Could not
> > find artifact jdk.tools:jdk.tools:jar:1.6 at specified path
> > /opt/java/openjdk/../lib/tools.jar
> > at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies (LifecycleDependencyResolver.java:243)
> > at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies (LifecycleDependencyResolver.java:136)
> > at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved (MojoExecutor.java:355)
> > at org.apache.maven.lifecycle.internal.MojoExecutor.doExecute (MojoExecutor.java:313)
> > at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:212)
> > at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:174)
> > at org.apache.maven.lifecycle.internal.MojoExecutor.access$000 (MojoExecutor.java:75)
> > at org.apache.maven.lifecycle.internal.MojoExecutor$1.run (MojoExecutor.java:162)
> > at org.apache.maven.plugin.DefaultMojosExecutionStrategy.execute (DefaultMojosExecutionStrategy.java:39)
> > at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:159)
> > at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:105)
> > at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:193)
> > at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:180)
> > at java.util.concurrent.FutureTask.run (FutureTask.java:264)
> > at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:515)
> > at java.util.concurrent.FutureTask.run (FutureTask.java:264)
> > at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1128)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:628)
> > at java.lang.Thread.run (Thread.java:829)
> > . . .
> > Caused by: org.eclipse.aether.transfer.ArtifactNotFoundException: Could
> > not find artifact 

Re: MCF Postgres upgrade to 15.4

2023-11-17 Thread Karl Wright
Generally, Postgresql is pretty stable, but you would want to update the
JDBC jar for postgresql as well.
Karl
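
As a hedged illustration of checking the JDBC jar after such an upgrade: the sketch below lists whichever JDBC drivers are registered on the current classpath, with the version each driver reports. It uses only the standard java.sql API; the class name ListJdbcDrivers is hypothetical, and it makes no assumption about where ManifoldCF keeps its driver jar.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ListJdbcDrivers {
    // Returns one "class major.minor" entry per driver registered with DriverManager.
    static List<String> registeredDrivers() {
        List<String> result = new ArrayList<>();
        for (Driver d : Collections.list(DriverManager.getDrivers())) {
            result.add(d.getClass().getName() + " "
                    + d.getMajorVersion() + "." + d.getMinorVersion());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> drivers = registeredDrivers();
        if (drivers.isEmpty()) {
            System.out.println("No JDBC drivers registered on this classpath.");
        }
        for (String entry : drivers) {
            System.out.println(entry);
        }
    }
}
```

Running it with the PostgreSQL driver jar on the classpath shows which driver version is actually being picked up.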


On Fri, Nov 17, 2023 at 6:19 AM Guylaine BASSETTE <
guylaine.basse...@francelabs.com> wrote:

> Hi all,
>
> For what it’s worth, we have upgraded Postgresql to version 15.4 for the
> MCF 2.26 that is embedded in Datafari, as part of our work towards
> Datafari 6.0.
>
> We have run our Datafari tests, and we have not identified any
> particular issues, so it seems that MCF 2.26 is compatible with it. Note
> that we have not run any of the tests embedded in MCF itself,
> so you may want to run those before certifying that it is 100%
> compatible, but as far as we are concerned it works like a charm.
>
> @Mingchun, since you are an expert in Postgresql, would you have some
> time to look at the Postgresql optimization parameters for MCF? It’s
> been quite a while since those were last reviewed for MCF (probably dating
> back to Postgresql 9.x), and it is highly possible that new parameters
> have appeared or that existing ones have changed.
>
>
> Regards,
> Guylaine
>
> France Labs – Your knowledge, now
> Datafari Enterprise Search – Découvrez la version 5 / Discover our version
> 5
> www.datafari.com 
>


2.26 release documentation did not build

2023-11-01 Thread Karl Wright
Reason: the switch to java 11 meant that Forrest did not compile.
Apparently the Java 11 compiler will no longer handle the earlier source
versions specified in the Forrest build.

I'm still going to update the website, but we're now also going to need to
perhaps invest in updating the Forrest version we use.  This is not
trivial because I had to make customizations to make the PDF generator use
the fonts I downloaded for this, and IIRC Forrest later changed in a way
that broke my customizations.

Karl


[RESULT][VOTE] Release Apache ManifoldCF 2.26, RC1

2023-11-01 Thread Karl Wright
Three +1's, >72 hours.  Vote passes!

Karl


On Wed, Nov 1, 2023 at 12:47 PM Karl Wright  wrote:

> +1 from me.
> Karl
>
>
> On Sun, Oct 29, 2023 at 4:47 AM Furkan KAMACI 
> wrote:
>
>> +1
>>
>> On Sun, Oct 29, 2023 at 2:00 AM Mingchun Zhao 
>> wrote:
>>
>> > +1
>> >
>> > Built and tested from tag release-2.26-RC1
>> > <http://svn.apache.org/repos/asf/manifoldcf/tags/release-2.26-RC1> with
>> > Ant.
>> >
>> > OS name: macOS 14.0
>> > Apache Ant(TM) version 1.10.0
>> > Java version: openjdk version "11.0.11"
>> > locale: en_US.UTF-8
>> >
>> > Thanks for doing the release!
>> >
>> > Regards,
>> > Mingchun
>> >
>> > On Sun, Oct 29, 2023 at 1:12 AM Karl Wright wrote:
>> >
>> > > Please vote on whether to release Apache ManifoldCF 2.26, RC1.  The
>> > release
>> > > candidate can be found here:
>> > >
>> > >
>> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.26
>> > >
>> > > There is also a release tag at:
>> > >
>> > > https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.26-RC1
>> > >
>> > > Thanks to Mingchun Zhao and Guylaine Bassette for making this release
>> > > possible!
>> > >
>> > > Karl
>> > >
>> >
>>
>


Re: [VOTE] Release Apache ManifoldCF 2.26, RC1

2023-11-01 Thread Karl Wright
+1 from me.
Karl


On Sun, Oct 29, 2023 at 4:47 AM Furkan KAMACI 
wrote:

> +1
>
> On Sun, Oct 29, 2023 at 2:00 AM Mingchun Zhao 
> wrote:
>
> > +1
> >
> > Built and tested from tag release-2.26-RC1
> > <http://svn.apache.org/repos/asf/manifoldcf/tags/release-2.26-RC1> with
> > Ant.
> >
> > OS name: macOS 14.0
> > Apache Ant(TM) version 1.10.0
> > Java version: openjdk version "11.0.11"
> > locale: en_US.UTF-8
> >
> > Thanks for doing the release!
> >
> > Regards,
> > Mingchun
> >
> > On Sun, Oct 29, 2023 at 1:12 AM Karl Wright wrote:
> >
> > > Please vote on whether to release Apache ManifoldCF 2.26, RC1.  The
> > release
> > > candidate can be found here:
> > >
> > >
> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.26
> > >
> > > There is also a release tag at:
> > >
> > > https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.26-RC1
> > >
> > > Thanks to Mingchun Zhao and Guylaine Bassette for making this release
> > > possible!
> > >
> > > Karl
> > >
> >
>


Re: [PR] Fix junit test failure with Solr 9.x Output connector [manifoldcf]

2023-10-28 Thread Karl Wright
This has been integrated and a new release spun.  Thank you!
Karl


On Sat, Oct 28, 2023 at 9:11 AM mingchun-zhao (via GitHub) 
wrote:

>
> mingchun-zhao opened a new pull request, #157:
> URL: https://github.com/apache/manifoldcf/pull/157
>
>In order to resolve junit test failure with Solr 9.x Output connector,
> I modified MockSolrService to support HTTP2C.
>
>I confirmed that all test cases of ant test passed.
>
>```
>~manifoldcf% ant test
>... ...
>test:
>BUILD SUCCESSFUL
>Total time: 51 minutes 37 seconds
>```
>
>
> --
> This is an automated message from the Apache Git Service.
> To respond to the message, please log on to GitHub and use the
> URL above to go to the specific comment.
>
> To unsubscribe, e-mail: dev-unsubscr...@manifoldcf.apache.org
>
> For queries about this service, please contact Infrastructure at:
> us...@infra.apache.org
>
>


[VOTE] Release Apache ManifoldCF 2.26, RC1

2023-10-28 Thread Karl Wright
Please vote on whether to release Apache ManifoldCF 2.26, RC1.  The release
candidate can be found here:

https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.26

There is also a release tag at:

https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.26-RC1

Thanks to Mingchun Zhao and Guylaine Bassette for making this release
possible!

Karl


Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-27 Thread Karl Wright
That is indeed reassuring.  It probably means that the test needs some
changes, is all.  But we cannot be sure the newer Zookeeper wouldn't mess
things up without having the test be successful at least with the older
zookeeper.

Karl


On Fri, Oct 27, 2023 at 7:32 AM Guylaine BASSETTE <
guylaine.basse...@francelabs.com> wrote:

> Sorry for not executing all of the tests: I'm quite new to the MCF
> project, and I focused only on the core-framework tests, as they were the
> key aspect highlighted in the related Jira issue. I didn't know I had to
> run other tests such as the IT ones. Note that we tested
> MCF with Solr 9 embedded in our Datafari, and after quite some
> indexing, we have seen no problems at all, which sounds quite reassuring.
>
> On 27/10/2023 at 12:49, Karl Wright wrote:
> > Okay, well I wouldn't have approved the upgrade had I known that the tests
> > didn't pass!  So we need to understand the problem as soon as possible.
> >
> > Karl
> >
> >
> > On Fri, Oct 27, 2023 at 2:50 AM Guylaine BASSETTE <
> > guylaine.basse...@francelabs.com> wrote:
> >
> >> Hello all,
> >>
> >> Sadly, the error remains the same even with the previous Zookeeper
> >> version (3.8.0). Actually, I have not been able to get the test to pass
> >> since we moved to Solr 9.
> >>
> >> On 27/10/2023 at 01:51, Karl Wright wrote:
> >>> It is possible that Solr needs the older version of Zookeeper. If you
> >>> swap out the current one and replace it with the one the version of
> >>> SolrJ we use references, does the test pass then? If it does, we're
> >>> going to have to figure out how to address the fact that we have two
> >>> connectors that each depend on a different version of zookeeper. But
> >>> first please let me know if it works. I'll suggest a way of
> >>> reconciling these once I know.
> >> --
> >> Regards,
> >> Guylaine
> >>
> >> France Labs – Your knowledge, now
> >> Datafari Enterprise Search – Découvrez la version 5 / Discover our
> version
> >> 5
> >> www.datafari.com  <http://www.datafari.com>
> --
> Regards,
> Guylaine
>
> France Labs – Your knowledge, now
> Datafari Enterprise Search – Découvrez la version 5 / Discover our version
> 5
> www.datafari.com <http://www.datafari.com>


Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-27 Thread Karl Wright
Okay, well I wouldn't have approved the upgrade had I known that the tests
didn't pass!  So we need to understand the problem as soon as possible.

Karl


On Fri, Oct 27, 2023 at 2:50 AM Guylaine BASSETTE <
guylaine.basse...@francelabs.com> wrote:

> Hello all,
>
> Sadly, the error remains the same even with the previous Zookeeper
> version (3.8.0). Actually, I have not been able to get the test to pass
> since we moved to Solr 9.
>
> On 27/10/2023 at 01:51, Karl Wright wrote:
> > It is possible that Solr needs the older version of Zookeeper. If you
> > swap out the current one and replace it with the one the version of
> > SolrJ we use references, does the test pass then? If it does, we're
> > going to have to figure out how to address the fact that we have two
> > connectors that each depend on a different version of zookeeper. But
> > first please let me know if it works. I'll suggest a way of
> > reconciling these once I know.
> --
> Regards,
> Guylaine
>
> France Labs – Your knowledge, now
> Datafari Enterprise Search – Découvrez la version 5 / Discover our version
> 5
> www.datafari.com <http://www.datafari.com>


Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-26 Thread Karl Wright
h based on the latest track? Unfortunately, I was unable
> to
> > >> apply your patch in my environment.
> > >>
> > >> On Thu, Oct 26, 2023 at 8:44 PM Guylaine BASSETTE <
> > >> guylaine.basse...@francelabs.com> wrote:
> > >>
> > >>> Hello Mingchun,
> > >>>
> > >>> As mentioned in my previous e-mail to Karl, my patch contained other
> > >>> files to be updated. I don't know why they were not taken into
> > >>> account, but would you be interested in trying my patch? I have fixed the
> > >>> spots I had missed... (attached to this mail)
> > >>>
> > >>> With these modifications I got through all of the Kafka tests.
> > >>> On 26/10/2023 at 03:47, Mingchun Zhao wrote:
> > >>>
> > >>> Hi there, allow me to share my `ant test` result using the latest GitHub
> > >>> trunk.
> > >>> I got a lot of `Broker may not be available` warnings in the Kafka IT test
> > >>> and it eventually failed.
> > >>>
> > >>> ```
> > >>> ...
> > >>>  [junit] [kafka-producer-network-thread | producer-3] INFO
> > >>> org.apache.kafka.clients.NetworkClient - [Producer
> clientId=producer-3]
> > >>> Node 0 disconnected.
> > >>>  [junit] [kafka-producer-network-thread | producer-3] WARN
> > >>> org.apache.kafka.clients.NetworkClient - [Producer
> clientId=producer-3]
> > >>> Connection to node 0 (/192.168.10.103:9092) could not be
> established.
> > >>> Broker may not be available.
> > >>>  [junit] [kafka-producer-network-thread | producer-1] INFO
> > >>> org.apache.kafka.clients.NetworkClient - [Producer
> clientId=producer-1]
> > >>> Node 0 disconnected.
> > >>>  [junit] [kafka-producer-network-thread | producer-1] WARN
> > >>> org.apache.kafka.clients.NetworkClient - [Producer
> clientId=producer-1]
> > >>> Connection to node 0 (/192.168.10.103:9092) could not be
> established.
> > >>> Broker may not be available.
> > >>>  [junit] [kafka-producer-network-thread | producer-2] INFO
> > >>> org.apache.kafka.clients.NetworkClient - [Producer
> clientId=producer-2]
> > >>> Node 0 disconnected.
> > >>>  [junit] [kafka-producer-network-thread | producer-2] WARN
> > >>> org.apache.kafka.clients.NetworkClient - [Producer
> clientId=producer-2]
> > >>> Connection to node 0 (/192.168.10.103:9092) could not be
> established.
> > >>> Broker may not be available.
> > >>>  [junit] [kafka-producer-network-thread | producer-3] INFO
> > >>> org.apache.kafka.clients.NetworkClient - [Producer
> clientId=producer-3]
> > >>> Node 0 disconnected.
> > >>>  [junit] [kafka-producer-network-thread | producer-3] WARN
> > >>> org.apache.kafka.clients.NetworkClient - [Producer
> clientId=producer-3]
> > >>> Connection to node 0 (/192.168.10.103:9092) could not be
> established.
> > >>> Broker may not be available.
> > >>>  [junit] -  ---
> > >>>  [junit] Testcase:
> > >>>
> > sanityCheck(org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT):
> > >>> Caused an ERROR
> > >>>  [junit] ManifoldCF did not terminate in the allotted time of
> > 12
> > >>> milliseconds
> > >>>  [junit]
> org.apache.manifoldcf.core.interfaces.ManifoldCFException:
> > >>> ManifoldCF did not terminate in the allotted time of 12
> > milliseconds
> > >>>  [junit] at
> > >>>
> >
> org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT.waitJobInactive(APISanityHSQLDBIT.java:289)
> > >>>  [junit] at
> > >>>
> >
> org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT.sanityCheck(APISanityHSQLDBIT.java:177)
> > >>>  [junit] at
> > >>>
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> > >>> Method)
> > >>>  [junit] at
> > >>>
> >
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > >>>  [junit] at
> > >>>
> >
> java.base/jdk.internal.reflect.Delegatin

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-25 Thread Karl Wright
I see you have committed this.  You missed a few spots; the extra jar was
mentioned in multiple places.  I committed another fix to correct that.

Karl


On Wed, Oct 25, 2023 at 10:46 AM Guylaine BASSETTE <
guylaine.basse...@francelabs.com> wrote:

> Hello Karl,
>
> Thank you very much for this update! I have tested your suggestions and
> the Kafka IT tests completed successfully. :-)
>
> Here you can find the patch.
>
> My only doubt is this warning I got in some parts of the Kafka test:
>
> ```
>
> [junit] [Controller-0-to-broker-0-send-thread] INFO
> org.apache.kafka.clients.NetworkClient - [Controller id=0,
> targetBrokerId=0] Node 0 disconnected.
> [junit] [Controller-0-to-broker-0-send-thread] WARN
> org.apache.kafka.clients.NetworkClient - [Controller id=0,
> targetBrokerId=0] Connection to node 0 (guylaine-virtual-machine/
> 127.0.1.1:9092) could not be established. Broker may not be available.
> [junit] [Controller-0-to-broker-0-send-thread] WARN
> kafka.controller.RequestSendThread - [RequestSendThread controllerId=0]
> Controller 0's connection to broker guylaine-virtual-machine:9092 (id: 0
> rack: null) was unsuccessful
> [junit] java.io.IOException: Connection to
> guylaine-virtual-machine:9092 (id: 0 rack: null) failed.
> [junit] at
> org.apache.kafka.clients.NetworkClientUtils.awaitReady(NetworkClientUtils.java:70)
> [junit] at
> kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:298)
> [junit] at
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:251)
> [junit] at
> org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:130)
> [junit] [Controller-0-to-broker-0-send-thread] INFO
> org.apache.kafka.clients.NetworkClient - [Controller id=0,
> targetBrokerId=0] Client requested connection close from node 0
> [junit] [controller-event-thread] INFO state.change.logger -
> [Controller id=0 epoch=2] Sending LeaderAndIsr request to broker 0 with 1
> become-leader and 0 become-follower partitions
> ```
>
> I have also run "ant test". It ran the core-framework and IT tests up to the
> MongoDB connector, which failed with this error:
> ```
> [junit] Testcase:
> sanityCheck(org.apache.manifoldcf.agents.output.mongodboutput.tests.APISanityHSQLDBIT):
> Caused an ERROR
> [junit] Could not start process: 
> [junit] java.lang.RuntimeException: Could not start process: 
> [junit] at
> de.flapdoodle.embed.mongo.AbstractMongoProcess.onAfterProcessStart(AbstractMongoProcess.java:81)
> [junit] at
> de.flapdoodle.embed.process.runtime.AbstractProcess.(AbstractProcess.java:115)
> [junit] at
> de.flapdoodle.embed.mongo.AbstractMongoProcess.(AbstractMongoProcess.java:54)
> [junit] at
> de.flapdoodle.embed.mongo.MongodProcess.(MongodProcess.java:50)
> [junit] at
> de.flapdoodle.embed.mongo.MongodExecutable.start(MongodExecutable.java:44)
> [junit] at
> de.flapdoodle.embed.mongo.MongodExecutable.start(MongodExecutable.java:34)
> [junit] at
> de.flapdoodle.embed.process.runtime.Executable.start(Executable.java:109)
> [junit] at
> org.apache.manifoldcf.agents.output.mongodboutput.tests.BaseITHSQLDB.setUpMongoDB(BaseITHSQLDB.java:72)
> [junit] at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> [junit] at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit] at
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit]
> [junit]
> [junit] Testcase:
> sanityCheck(org.apache.manifoldcf.agents.output.mongodboutput.tests.APISanityHSQLDBIT):
> Caused an ERROR
> [junit] null
> [junit] java.lang.NullPointerException
> [junit] at
> org.apache.manifoldcf.agents.output.mongodboutput.tests.APISanityHSQLDBIT.removeTestArea(APISanityHSQLDBIT.java:109)
> [junit] at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> [junit] at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit] at
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
> ```
>
> And still no success with the IT tests of the Solr connector...
>
> Guylaine
> On 25/10/2023 at 12:52, Karl Wright wrote:
>
> I was able to reproduce the problem last night.  I believe the cause may
> well be that we've moved too many dependencies to the framework level.
> Specifically, I think perhaps only zookeeper and netty should run there,
> but th

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-25 Thread Karl Wright
I was able to reproduce the problem last night.  I believe the cause may
well be that we've moved too many dependencies to the framework level.
Specifically, I think perhaps only zookeeper and netty should run there,
but the Scala library probably needs to run at the same classloader level
as the other Scala jars, so it should be moved back into connectors/kafka
and removed from framework/build.xml and from build.xml and from
connector-build.xml.
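
To check a classloader theory like this one, it can help to print which jar and which classloader a given class actually comes from. The following is only a diagnostic sketch using standard JDK calls; the class name WhichJar is made up, and the default class names are simply the ones that appear in the stack traces in this thread. It has to be run with the same classpath as the failing test to tell you anything.

```java
import java.security.CodeSource;

public class WhichJar {
    // Describes where a class was loaded from: its code source (usually a jar)
    // and the classloader that defined it. The class is not initialized.
    static String describe(String className) {
        try {
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            ClassLoader cl = c.getClassLoader();
            return className + " <- "
                    + (src == null ? "(no code source)" : String.valueOf(src.getLocation()))
                    + " via "
                    + (cl == null ? "bootstrap loader" : cl.getClass().getName());
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " not loadable: " + e;
        }
    }

    public static void main(String[] args) {
        // Defaults are taken from the stack traces quoted in this thread.
        String[] names = args.length > 0 ? args : new String[] {
            "scala.runtime.ScalaRunTime$",
            "io.netty.handler.ssl.SslContext",
            "org.apache.zookeeper.common.ZKConfig"
        };
        for (String name : names) {
            System.out.println(describe(name));
        }
    }
}
```

If the Scala runtime class and the Kafka classes resolve to jars at different classloader levels, or to an unexpected Scala library version, that would be consistent with the NoSuchMethodError reported earlier.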

Sadly I'm completely snowed under until the weekend so it will need to wait
until then unless someone else wants to try this.

Karl


On Tue, Oct 24, 2023 at 9:12 AM Karl Wright  wrote:

> Try doing svn update and deleting your test-materials directory contents.
> Then run ant download-dependencies.  You don't get a link error after that when
> you run run-IT-HSQLDB.
>
> Karl
>
> On Tue, Oct 24, 2023 at 9:10 AM Guylaine BASSETTE <
> guylaine.basse...@francelabs.com> wrote:
>
>> Hi Karl and Mingchun,
>>
>> Thanks again !
>>
>> My bad for the Zookeeper dependencies. Actually, I made a mistake
>> when using my IDE's dependency analyzer.
>>
>> Regarding the Kafka tests, a quick search points me to the
>> "spark-streaming-kafka" dependency, which might be missing...
>>
>> In the meantime, I am continuing my work on the Solr connector tests.
>>
>>
>> In case it is useful, here are the errors I get:
>>
>> ```
>>
>>[junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
>> Stopped o.e.j.w.WebAppContext@7c28c1{ManifoldCF General API
>>
>> Webapp,/mcf-api-service,null,STOPPED}{/home/guylaine/IdeaProjects/mon-manifoldcf/dist/web/war/mcf-api-service.war}
>>  [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Stopped o.e.j.w.WebAppContext@588ffeb{ManifoldCF Authorities API
>>
>> Webapp,/mcf-authority-service,null,STOPPED}{/home/guylaine/IdeaProjects/mon-manifoldcf/dist/web/war/mcf-authority-service.war}
>>  [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Stopped o.e.j.w.WebAppContext@71a3a190{ManifoldCF Crawler
>>
>> Interface,/mcf-crawler-ui,null,STOPPED}{/home/guylaine/IdeaProjects/mon-manifoldcf/dist/web/war/mcf-crawler-ui.war}
>>  [junit] -  ---
>>  [junit] Testcase:
>> sanityCheck(org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT):
>> Caused an ERROR
>>  [junit] 'scala.collection.immutable.ArraySeq
>> scala.runtime.ScalaRunTime$.wrapRefArray(java.lang.Object[])'
>>  [junit] java.lang.NoSuchMethodError:
>> 'scala.collection.immutable.ArraySeq
>> scala.runtime.ScalaRunTime$.wrapRefArray(java.lang.Object[])'
>>  [junit] at
>> kafka.server.KafkaConfig$.(KafkaConfig.scala:338)
>>  [junit] at
>> kafka.server.KafkaConfig.(KafkaConfig.scala:1603)
>>  [junit] at
>>
>> org.apache.manifoldcf.agents.output.kafka.KafkaLocal.(KafkaLocal.java:31)
>>  [junit] at
>>
>> org.apache.manifoldcf.agents.output.kafka.BaseITHSQLDB.setupKafka(BaseITHSQLDB.java:86)
>>  [junit] at
>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> Method)
>>  [junit] at
>>
>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>  [junit] at
>>
>> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>  [junit]
>>  [junit]
>>  [junit] Testcase:
>> sanityCheck(org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT):
>> Caused an ERROR
>>  [junit] null
>>  [junit] java.lang.NullPointerException
>>  [junit] at
>>
>> org.apache.manifoldcf.agents.output.kafka.BaseITHSQLDB.cleanUpKafka(BaseITHSQLDB.java:92)
>>  [junit] at
>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
>> Method)
>>  [junit] at
>>
>> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>  [junit] at
>>
>> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>
>> ```
>>
>> On 24/10/2023 at 12:30, Karl Wright wrote:
>> > I missed a place - connector-build.xml.  Updated now.
>> >
>> > Now we don't get a link exception, but neither does the kafka test work.
>> > It seems to be unable to start zookeeper even though all the
>> dependencies
>> > are now there.  

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-24 Thread Karl Wright
Try doing svn update and deleting your test-materials directory contents.
Then run ant download-dependencies.  You don't get a link error after that when
you run run-IT-HSQLDB.

Karl

On Tue, Oct 24, 2023 at 9:10 AM Guylaine BASSETTE <
guylaine.basse...@francelabs.com> wrote:

> Hi Karl and Mingchun,
>
> Thanks again !
>
> My bad for the Zookeeper dependencies. Actually, I made a mistake
> when using my IDE's dependency analyzer.
>
> Regarding the Kafka tests, a quick search points me to the
> "spark-streaming-kafka" dependency, which might be missing...
>
> In the meantime, I am continuing my work on the Solr connector tests.
>
>
> In case it is useful, here are the errors I get:
>
> ```
>
>[junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
> Stopped o.e.j.w.WebAppContext@7c28c1{ManifoldCF General API
>
> Webapp,/mcf-api-service,null,STOPPED}{/home/guylaine/IdeaProjects/mon-manifoldcf/dist/web/war/mcf-api-service.war}
>  [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
> - Stopped o.e.j.w.WebAppContext@588ffeb{ManifoldCF Authorities API
>
> Webapp,/mcf-authority-service,null,STOPPED}{/home/guylaine/IdeaProjects/mon-manifoldcf/dist/web/war/mcf-authority-service.war}
>  [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
> - Stopped o.e.j.w.WebAppContext@71a3a190{ManifoldCF Crawler
>
> Interface,/mcf-crawler-ui,null,STOPPED}{/home/guylaine/IdeaProjects/mon-manifoldcf/dist/web/war/mcf-crawler-ui.war}
>  [junit] -  ---
>  [junit] Testcase:
> sanityCheck(org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT):
> Caused an ERROR
>  [junit] 'scala.collection.immutable.ArraySeq
> scala.runtime.ScalaRunTime$.wrapRefArray(java.lang.Object[])'
>  [junit] java.lang.NoSuchMethodError:
> 'scala.collection.immutable.ArraySeq
> scala.runtime.ScalaRunTime$.wrapRefArray(java.lang.Object[])'
>  [junit] at
> kafka.server.KafkaConfig$.(KafkaConfig.scala:338)
>  [junit] at kafka.server.KafkaConfig.(KafkaConfig.scala:1603)
>  [junit] at
>
> org.apache.manifoldcf.agents.output.kafka.KafkaLocal.(KafkaLocal.java:31)
>  [junit] at
>
> org.apache.manifoldcf.agents.output.kafka.BaseITHSQLDB.setupKafka(BaseITHSQLDB.java:86)
>  [junit] at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
>  [junit] at
>
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  [junit] at
>
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  [junit]
>  [junit]
>  [junit] Testcase:
> sanityCheck(org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT):
> Caused an ERROR
>  [junit] null
>  [junit] java.lang.NullPointerException
>  [junit] at
>
> org.apache.manifoldcf.agents.output.kafka.BaseITHSQLDB.cleanUpKafka(BaseITHSQLDB.java:92)
>  [junit] at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
>  [junit] at
>
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  [junit] at
>
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
> ```
>
> On 24/10/2023 at 12:30, Karl Wright wrote:
> > I missed a place - connector-build.xml.  Updated now.
> >
> > Now we don't get a link exception, but neither does the kafka test work.
> > It seems to be unable to start zookeeper even though all the dependencies
> > are now there.  Will need to look at this after work.
> >
> > Karl
> >
> >
> > On Mon, Oct 23, 2023 at 11:32 PM Mingchun Zhao
> > wrote:
> >
> >> Thanks. I've tried `ant test` with the latest trunk. As a result, the
> kafka
> >> test failed as below.
> >> ```
> >>  [junit] Testcase:
> >>
> sanityCheck(org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT):
> >> Caused an ERROR
> >>  [junit] io/netty/handler/ssl/SslContext
> >>  [junit] java.lang.NoClassDefFoundError:
> io/netty/handler/ssl/SslContext
> >>  [junit] at
> >>
> >>
> org.apache.zookeeper.common.ZKConfig.handleBackwardCompatibility(ZKConfig.java:106)
> >>  [junit] at
> >>
> >>
> org.apache.zookeeper.client.ZKClientConfig.handleBackwardCompatibility(ZKClientConfig.java:96)
> >>  [junit] at
> >> org.apache.zookeeper.common.ZKConfig.ini

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-24 Thread Karl Wright
I missed a place - connector-build.xml.  Updated now.

Now we don't get a link exception, but neither does the kafka test work.
It seems to be unable to start zookeeper even though all the dependencies
are now there.  Will need to look at this after work.

Karl


On Mon, Oct 23, 2023 at 11:32 PM Mingchun Zhao 
wrote:

> Thanks. I've tried `ant test` with the latest trunk. As a result, the kafka
> test failed as below.
> ```
> [junit] Testcase:
> sanityCheck(org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT):
> Caused an ERROR
> [junit] io/netty/handler/ssl/SslContext
> [junit] java.lang.NoClassDefFoundError: io/netty/handler/ssl/SslContext
> [junit] at
>
> org.apache.zookeeper.common.ZKConfig.handleBackwardCompatibility(ZKConfig.java:106)
> [junit] at
>
> org.apache.zookeeper.client.ZKClientConfig.handleBackwardCompatibility(ZKClientConfig.java:96)
> [junit] at
> org.apache.zookeeper.common.ZKConfig.init(ZKConfig.java:92)
> [junit] at
> org.apache.zookeeper.common.ZKConfig.(ZKConfig.java:61)
> [junit] at
> org.apache.zookeeper.client.ZKClientConfig.(ZKClientConfig.java:69)
> [junit] at kafka.server.KafkaConfig.(KafkaConfig.scala:1620)
> [junit] at kafka.server.KafkaConfig.(KafkaConfig.scala:1603)
> [junit] at
>
> org.apache.manifoldcf.agents.output.kafka.KafkaLocal.(KafkaLocal.java:31)
> [junit] at
>
> org.apache.manifoldcf.agents.output.kafka.BaseITHSQLDB.setupKafka(BaseITHSQLDB.java:86)
> [junit] at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> [junit] at
>
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit] at
>
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit] Caused by: java.lang.ClassNotFoundException:
> io.netty.handler.ssl.SslContext
> [junit] at
>
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
> [junit] at
>
> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
> [junit] at
> java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
> [junit]
> [junit]
> [junit] Testcase:
> sanityCheck(org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT):
> Caused an ERROR
> [junit] null
> [junit] java.lang.NullPointerException
> [junit] at
>
> org.apache.manifoldcf.agents.output.kafka.BaseITHSQLDB.cleanUpKafka(BaseITHSQLDB.java:92)
> [junit] at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> [junit] at
>
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit] at
>
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit]
> [junit]
>
> BUILD FAILED
> /Users/zhaomingchun/ManifoldCF/manifoldcf/build.xml:517: The following
> error occurred while executing this line:
> /Users/zhaomingchun/ManifoldCF/manifoldcf/build.xml:471: The following
> error occurred while executing this line:
> /Users/zhaomingchun/ManifoldCF/manifoldcf/dist/connector-build.xml:1102:
> Test org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT failed
> ```
>
> On Tue, Oct 24, 2023 at 11:00, Karl Wright wrote:
>
> > Okay, I updated zookeeper properly in build.xml and framework/build.xml,
> > with the two new dependencies, and the zookeeper tests pass.  I haven't
> > tried the kafka or solr tests yet.
> >
> > Karl
> >
> >
> > On Mon, Oct 23, 2023 at 9:29 PM Karl Wright  wrote:
> >
> > > Unless I know what kafka is using zookeeper for, this would seem risky
> to
> > > me.  Zookeeper is meant to coordinate processes; it may not work for
> one
> > > process to be using different versions of zookeeper than the others.
> > >
> > > It looks like the original change to kafka you reverted had the proper
> > > dependencies but they absolutely needed to be included in the right
> > > classpaths and they weren't - they were only included in the kafka
> tests.
> > > I will look at this perhaps at the latest this weekend, but I won't
> > commit
> > > this patch.
> > >
> > >
> > > Karl
> > >
> > >
> > > On Mon, Oct 23, 2023 at 5:14 PM Mingchun Zhao <
> mingchun.zha...@gmail.com
> > >
> > > wrote:
> > >
> > >> I reverted zookeeper version to 3.8.0 to avoid linkage error on the
>

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-23 Thread Karl Wright
Okay, I updated zookeeper properly in build.xml and framework/build.xml,
with the two new dependencies, and the zookeeper tests pass.  I haven't
tried the kafka or solr tests yet.

Karl


On Mon, Oct 23, 2023 at 9:29 PM Karl Wright  wrote:

> Unless I know what kafka is using zookeeper for, this would seem risky to
> me.  Zookeeper is meant to coordinate processes; it may not work for one
> process to be using different versions of zookeeper than the others.
>
> It looks like the original change to kafka you reverted had the proper
> dependencies but they absolutely needed to be included in the right
> classpaths and they weren't - they were only included in the kafka tests.
> I will look at this perhaps at the latest this weekend, but I won't commit
> this patch.
>
>
> Karl
>
>
> On Mon, Oct 23, 2023 at 5:14 PM Mingchun Zhao 
> wrote:
>
>> I reverted zookeeper version to 3.8.0 to avoid linkage error on the
>> multiThreadZooKeeperLockTest:
>> [junit] Caused by: java.lang.ClassNotFoundException:
>> io.netty.handler.ssl.SslContext
>>
>> I've prepared a PR here:
>> https://github.com/apache/manifoldcf/pull/156
>>
>> Just a heads up, `ant test` still hangs on the Solr Output connector test:
>> ```
>> run-IT-HSQLDB:
>> [junit] Testsuite:
>> org.apache.manifoldcf.agents.output.solr.tests.SolrCrawlHSQLDBIT
>> [junit] Configuration file successfully read
>> [junit] [main] INFO org.eclipse.jetty.util.log - Logging initialized
>> @7027ms to org.eclipse.jetty.util.log.Slf4jLog
>> [junit] [main] INFO org.eclipse.jetty.server.Server -
>> jetty-9.4.48.v20220622; built: 2022-06-21T20:42:25.880Z; git:
>> 6b67c5719d1f4371b33655ff2d047d24e171e49a; jvm 11.0.11+9
>> [junit] [main] INFO org.eclipse.jetty.server.session -
>> DefaultSessionIdManager workerName=node0
>> [junit] [main] INFO org.eclipse.jetty.server.session - No
>> SessionScavenger set, using defaults
>> [junit] [main] INFO org.eclipse.jetty.server.session - node0
>> Scavenging
>> every 66ms
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
>> Started o.e.j.w.WebAppContext@1517f633{ManifoldCF Crawler
>>
>> Interface,/mcf-crawler-ui,file:///private/var/folders/zh/mx4q_qh93cv6jtp13ht8b1frgn/T/jetty-0_0_0_0-8346-mcf-crawler-ui_war-_mcf-crawler-ui-any-7554899724821045499/webapp/,AVAILABLE}{/Users/zhaomingchun/ManifoldCF/manifoldcf/dist/web/war/mcf-crawler-ui.war}
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
>> Started o.e.j.w.WebAppContext@4fe01803{ManifoldCF Authorities API
>>
>> Webapp,/mcf-authority-service,file:///private/var/folders/zh/mx4q_qh93cv6jtp13ht8b1frgn/T/jetty-0_0_0_0-8346-mcf-authority-service_war-_mcf-authority-service-any-7701836901953162228/webapp/,AVAILABLE}{/Users/zhaomingchun/ManifoldCF/manifoldcf/dist/web/war/mcf-authority-service.war}
>> [junit] Creating mock service
>> [junit] Mock service created
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
>> Started o.e.j.w.WebAppContext@13d186db{ManifoldCF General API
>>
>> Webapp,/mcf-api-service,file:///private/var/folders/zh/mx4q_qh93cv6jtp13ht8b1frgn/T/jetty-0_0_0_0-8346-mcf-api-service_war-_mcf-api-service-any-2609388202403972652/webapp/,AVAILABLE}{/Users/zhaomingchun/ManifoldCF/manifoldcf/dist/web/war/mcf-api-service.war}
>> [junit] [main] INFO org.eclipse.jetty.server.AbstractConnector -
>> Started ServerConnector@3bd55d8{HTTP/1.1, (http/1.1)}{0.0.0.0:8346}
>> [junit] [main] INFO org.eclipse.jetty.server.Server - Started @9054ms
>> [junit] [main] INFO org.eclipse.jetty.server.Server -
>> jetty-9.4.48.v20220622; built: 2022-06-21T20:42:25.880Z; git:
>> 6b67c5719d1f4371b33655ff2d047d24e171e49a; jvm 11.0.11+9
>> [junit] [main] INFO org.eclipse.jetty.server.session -
>> DefaultSessionIdManager workerName=node0
>> [junit] [main] INFO org.eclipse.jetty.server.session - No
>> SessionScavenger set, using defaults
>> [junit] [main] INFO org.eclipse.jetty.server.session - node0
>> Scavenging
>> every 60ms
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
>> Started o.e.j.s.ServletContextHandler@6f4ade6e{/solr,null,AVAILABLE}
>> [junit] [main] INFO org.eclipse.jetty.server.AbstractConnector -
>> Started ServerConnector@30e6a763{HTTP/1.1, (http/1.1)}{0.0.0.0:8188}
>> [junit] [main] INFO org.eclipse.jetty.server.Server - Started @9064ms
>> [junit] [main] INFO org.eclipse.jetty.server.AbstractConnector -
>> Stopped ServerConnector@30e6a763{HTTP/1.1, (http/1.1)}{0.0.0.0:8188}
>> [junit] [main] 

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-23 Thread Karl Wright
Unless I know what kafka is using zookeeper for, this would seem risky to
me.  Zookeeper is meant to coordinate processes; it may not work for one
process to be using different versions of zookeeper than the others.

It looks like the original change to kafka you reverted had the proper
dependencies but they absolutely needed to be included in the right
classpaths and they weren't - they were only included in the kafka tests.
I will look at this perhaps at the latest this weekend, but I won't commit
this patch.


Karl


On Mon, Oct 23, 2023 at 5:14 PM Mingchun Zhao 
wrote:

> I reverted zookeeper version to 3.8.0 to avoid linkage error on the
> multiThreadZooKeeperLockTest:
> [junit] Caused by: java.lang.ClassNotFoundException:
> io.netty.handler.ssl.SslContext
>
> I've prepared a PR here:
> https://github.com/apache/manifoldcf/pull/156
>
> Just a heads up, `ant test` still hangs on the Solr Output connector test:
> ```
> run-IT-HSQLDB:
> [junit] Testsuite:
> org.apache.manifoldcf.agents.output.solr.tests.SolrCrawlHSQLDBIT
> [junit] Configuration file successfully read
> [junit] [main] INFO org.eclipse.jetty.util.log - Logging initialized
> @7027ms to org.eclipse.jetty.util.log.Slf4jLog
> [junit] [main] INFO org.eclipse.jetty.server.Server -
> jetty-9.4.48.v20220622; built: 2022-06-21T20:42:25.880Z; git:
> 6b67c5719d1f4371b33655ff2d047d24e171e49a; jvm 11.0.11+9
> [junit] [main] INFO org.eclipse.jetty.server.session -
> DefaultSessionIdManager workerName=node0
> [junit] [main] INFO org.eclipse.jetty.server.session - No
> SessionScavenger set, using defaults
> [junit] [main] INFO org.eclipse.jetty.server.session - node0 Scavenging
> every 66ms
> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
> Started o.e.j.w.WebAppContext@1517f633{ManifoldCF Crawler
>
> Interface,/mcf-crawler-ui,file:///private/var/folders/zh/mx4q_qh93cv6jtp13ht8b1frgn/T/jetty-0_0_0_0-8346-mcf-crawler-ui_war-_mcf-crawler-ui-any-7554899724821045499/webapp/,AVAILABLE}{/Users/zhaomingchun/ManifoldCF/manifoldcf/dist/web/war/mcf-crawler-ui.war}
> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
> Started o.e.j.w.WebAppContext@4fe01803{ManifoldCF Authorities API
>
> Webapp,/mcf-authority-service,file:///private/var/folders/zh/mx4q_qh93cv6jtp13ht8b1frgn/T/jetty-0_0_0_0-8346-mcf-authority-service_war-_mcf-authority-service-any-7701836901953162228/webapp/,AVAILABLE}{/Users/zhaomingchun/ManifoldCF/manifoldcf/dist/web/war/mcf-authority-service.war}
> [junit] Creating mock service
> [junit] Mock service created
> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
> Started o.e.j.w.WebAppContext@13d186db{ManifoldCF General API
>
> Webapp,/mcf-api-service,file:///private/var/folders/zh/mx4q_qh93cv6jtp13ht8b1frgn/T/jetty-0_0_0_0-8346-mcf-api-service_war-_mcf-api-service-any-2609388202403972652/webapp/,AVAILABLE}{/Users/zhaomingchun/ManifoldCF/manifoldcf/dist/web/war/mcf-api-service.war}
> [junit] [main] INFO org.eclipse.jetty.server.AbstractConnector -
> Started ServerConnector@3bd55d8{HTTP/1.1, (http/1.1)}{0.0.0.0:8346}
> [junit] [main] INFO org.eclipse.jetty.server.Server - Started @9054ms
> [junit] [main] INFO org.eclipse.jetty.server.Server -
> jetty-9.4.48.v20220622; built: 2022-06-21T20:42:25.880Z; git:
> 6b67c5719d1f4371b33655ff2d047d24e171e49a; jvm 11.0.11+9
> [junit] [main] INFO org.eclipse.jetty.server.session -
> DefaultSessionIdManager workerName=node0
> [junit] [main] INFO org.eclipse.jetty.server.session - No
> SessionScavenger set, using defaults
> [junit] [main] INFO org.eclipse.jetty.server.session - node0 Scavenging
> every 60ms
> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
> Started o.e.j.s.ServletContextHandler@6f4ade6e{/solr,null,AVAILABLE}
> [junit] [main] INFO org.eclipse.jetty.server.AbstractConnector -
> Started ServerConnector@30e6a763{HTTP/1.1, (http/1.1)}{0.0.0.0:8188}
> [junit] [main] INFO org.eclipse.jetty.server.Server - Started @9064ms
> [junit] [main] INFO org.eclipse.jetty.server.AbstractConnector -
> Stopped ServerConnector@30e6a763{HTTP/1.1, (http/1.1)}{0.0.0.0:8188}
> [junit] [main] INFO org.eclipse.jetty.server.session - node0 Stopped
> scavenging
> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
> Stopped o.e.j.s.ServletContextHandler@6f4ade6e{/solr,null,STOPPED}
> ```
>
> On Tue, Oct 24, 2023 at 1:40, Karl Wright wrote:
>
> > The dependencies would be in the zookeeper pom.  Maven would follow them
> > automatically which is why it is insufficient to assume that if maven
> works
> > so will ant.
> >
> > You can use mvn dependency:tree to find what Maven is actually pulling

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-23 Thread Karl Wright
The dependencies would be in the zookeeper pom.  Maven would follow them
automatically which is why it is insufficient to assume that if maven works
so will ant.

You can use mvn dependency:tree to find what Maven is actually pulling in.

Karl


On Mon, Oct 23, 2023 at 11:04 AM Guylaine BASSETTE <
guylaine.basse...@francelabs.com> wrote:

> I launched these tests with a Maven build and everything is OK. Yet no
> netty dependencies are required... I don't understand where this
> SslContext is called...
>
> On 23/10/2023 at 16:25, Karl Wright wrote:
> > Yes, that is indicating that zookeeper is looking for a specific netty
> > class that it isn't finding.  That is why I think there is now a
> zookeeper
> > dependency we aren't including in the classpaths that include zookeeper.
> >
> > Karl
> >
> >
> > On Mon, Oct 23, 2023 at 10:23 AM Mingchun Zhao
> > wrote:
> >
> >> Karl, Thanks!
> >> I think I reproduced that multiThreadZooKeeperLockTest error when
> >> running `ant test`; I will look into this.
> >> ```
> >>  [junit] -  ---
> >>  [junit] Testcase:
> >>
> >>
> multiThreadZooKeeperLockTest(org.apache.manifoldcf.core.lockmanager.TestZooKeeperLocks):
> >>  Caused an ERROR
> >>  [junit] io/netty/handler/ssl/SslContext
> >>  [junit] java.lang.NoClassDefFoundError:
> io/netty/handler/ssl/SslContext
> >>  [junit] at
> >>
> >>
> org.apache.zookeeper.common.ZKConfig.handleBackwardCompatibility(ZKConfig.java:106)
> >>  [junit] at
> >>
> >>
> org.apache.zookeeper.client.ZKClientConfig.handleBackwardCompatibility(ZKClientConfig.java:96)
> >>  [junit] at
> >> org.apache.zookeeper.common.ZKConfig.init(ZKConfig.java:92)
> >>  [junit] at
> > >> org.apache.zookeeper.common.ZKConfig.<init>(ZKConfig.java:61)
> > >>  [junit] at
> > >>
> org.apache.zookeeper.client.ZKClientConfig.<init>(ZKClientConfig.java:69)
> > >>  [junit] at
> > >> org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:643)
> > >>  [junit] at
> > >> org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:567)
> > >>  [junit] at
> > >> org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:734)
> > >>  [junit] at
> > >> org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:448)
> >>  [junit] at
> >>
> >>
> org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.createSession(ZooKeeperConnection.java:74)
> >>  [junit] at
> >>
> >>
> org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.<init>(ZooKeeperConnection.java:66)
> >>  [junit] at
> >>
> >>
> org.apache.manifoldcf.core.lockmanager.ZooKeeperConnectionPool.grab(ZooKeeperConnectionPool.java:48)
> >>  [junit] at
> >>
> >>
> org.apache.manifoldcf.core.lockmanager.ZooKeeperLockObject.obtainGlobalReadLock(ZooKeeperLockObject.java:190)
> >>  [junit] at
> >>
> >>
> org.apache.manifoldcf.core.lockmanager.LockObject.enterReadLock(LockObject.java:310)
> >>  [junit] at
> >>
> >>
> org.apache.manifoldcf.core.lockmanager.LockGate.enterReadLock(LockGate.java:271)
> >>  [junit] at
> >>
> >>
> org.apache.manifoldcf.core.lockmanager.TestZooKeeperLocks.enterReadLock(TestZooKeeperLocks.java:125)
> >>  [junit] at
> >>
> >>
> org.apache.manifoldcf.core.lockmanager.TestZooKeeperLocks$ReaderThread.run(TestZooKeeperLocks.java:204)
> >>  [junit] Caused by: java.lang.ClassNotFoundException:
> >> io.netty.handler.ssl.SslContext
> >>  [junit] at
> >>
> >>
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
> >>  [junit] at
> >>
> >>
> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
> >>  [junit] at
> >> java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
> >>  [junit]
> >>  [junit]
> >>
> >> BUILD FAILED
> >> ```
> >>
> >> On Mon, Oct 23, 2023 at 23:02, Karl Wright wrote:
> >>
> >>> This is all you need to do:
> >>>
> >>> ant clean-core-deps
> >>> ant make-core-deps
> >>> ant clean
> >>> ant test
> >>>
> >>> Karl
> >>>
> >>>
> >>> On Mon, Oct 23, 2023 at 9:55 AM Mingchun Zhao

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-23 Thread Karl Wright
Yes, that is indicating that zookeeper is looking for a specific netty
class that it isn't finding.  That is why I think there is now a zookeeper
dependency we aren't including in the classpaths that include zookeeper.
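
One way to confirm this kind of missing dependency is to probe the classpath directly. The following is a minimal sketch, not part of the ManifoldCF build: the class name `ClassProbe` and its methods are hypothetical, and it assumes it is run with the same classpath that the failing test uses. It only checks whether a class such as `io.netty.handler.ssl.SslContext` can be loaded and, if so, where it came from.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of ManifoldCF): checks whether a class is on the classpath.
public class ClassProbe {
    /** Returns true if the named class can be loaded from the current classpath. */
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate and load the class, do not run static initializers.
            Class.forName(className, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so both failure modes land here.
            return false;
        }
    }

    /** Returns the jar or directory the class was loaded from, or a note if unknown. */
    public static String origin(String className) throws ClassNotFoundException {
        Class<?> c = Class.forName(className, false, ClassProbe.class.getClassLoader());
        CodeSource source = c.getProtectionDomain().getCodeSource();
        return source == null ? "(JDK runtime or unknown source)" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) throws Exception {
        // Defaults to the class named in the stack trace; pass another name as the first argument.
        String name = args.length > 0 ? args[0] : "io.netty.handler.ssl.SslContext";
        if (isLoadable(name)) {
            System.out.println(name + " found in " + origin(name));
        } else {
            System.out.println(name + " is NOT on this classpath");
        }
    }
}
```

Running it once with the Maven-resolved classpath and once with the Ant test classpath should show whether the netty class is present in one and absent in the other.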

Karl


On Mon, Oct 23, 2023 at 10:23 AM Mingchun Zhao 
wrote:

> Karl, Thanks!
> I think I reproduced that multiThreadZooKeeperLockTest error when
> running `ant test`; I will look into this.
> ```
> [junit] -  ---
> [junit] Testcase:
>
> multiThreadZooKeeperLockTest(org.apache.manifoldcf.core.lockmanager.TestZooKeeperLocks):
> Caused an ERROR
> [junit] io/netty/handler/ssl/SslContext
> [junit] java.lang.NoClassDefFoundError: io/netty/handler/ssl/SslContext
> [junit] at
>
> org.apache.zookeeper.common.ZKConfig.handleBackwardCompatibility(ZKConfig.java:106)
> [junit] at
>
> org.apache.zookeeper.client.ZKClientConfig.handleBackwardCompatibility(ZKClientConfig.java:96)
> [junit] at
> org.apache.zookeeper.common.ZKConfig.init(ZKConfig.java:92)
> [junit] at
> org.apache.zookeeper.common.ZKConfig.<init>(ZKConfig.java:61)
> [junit] at
> org.apache.zookeeper.client.ZKClientConfig.<init>(ZKClientConfig.java:69)
> [junit] at
> org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:643)
> [junit] at
> org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:567)
> [junit] at
> org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:734)
> [junit] at
> org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:448)
> [junit] at
>
> org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.createSession(ZooKeeperConnection.java:74)
> [junit] at
>
> org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.<init>(ZooKeeperConnection.java:66)
> [junit] at
>
> org.apache.manifoldcf.core.lockmanager.ZooKeeperConnectionPool.grab(ZooKeeperConnectionPool.java:48)
> [junit] at
>
> org.apache.manifoldcf.core.lockmanager.ZooKeeperLockObject.obtainGlobalReadLock(ZooKeeperLockObject.java:190)
> [junit] at
>
> org.apache.manifoldcf.core.lockmanager.LockObject.enterReadLock(LockObject.java:310)
> [junit] at
>
> org.apache.manifoldcf.core.lockmanager.LockGate.enterReadLock(LockGate.java:271)
> [junit] at
>
> org.apache.manifoldcf.core.lockmanager.TestZooKeeperLocks.enterReadLock(TestZooKeeperLocks.java:125)
> [junit] at
>
> org.apache.manifoldcf.core.lockmanager.TestZooKeeperLocks$ReaderThread.run(TestZooKeeperLocks.java:204)
> [junit] Caused by: java.lang.ClassNotFoundException:
> io.netty.handler.ssl.SslContext
> [junit] at
>
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
> [junit] at
>
> java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
> [junit] at
> java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
> [junit]
> [junit]
>
> BUILD FAILED
> ```
>
> On Mon, Oct 23, 2023 at 23:02, Karl Wright wrote:
>
> > This is all you need to do:
> >
> > ant clean-core-deps
> > ant make-core-deps
> > ant clean
> > ant test
> >
> > Karl
> >
> >
> > On Mon, Oct 23, 2023 at 9:55 AM Mingchun Zhao
> > wrote:
> >
> > > Hi Guylaine, Thanks!
> > >
> > > > Thanks for all your sharing, it's very helpful! I'll continue...
> > >
> > > I'll look into it some more too. If I have any other information I'll
> > share
> > > it with you.
> > >
> > > On Mon, Oct 23, 2023 at 22:49, Guylaine BASSETTE <guylaine.basse...@francelabs.com> wrote:
> > >
> > > > Thanks for all your sharing, it's very helpful! I'll continue...
> > > >
> > > > Sorry, French and English mixed up!
> > > >
> > > > > On 23/10/2023 at 15:46, Guylaine BASSETTE wrote:
> > > > > Hi all,
> > > > >
> > > > > > Thanks for all your shares, it's very helpful! Merci pour tous vos
> > > > > partages, c'est très utile ! Je poursuis...
> > > > >
> > > > > > On 23/10/2023 at 15:31, Karl Wright wrote:
> > > > >> I can't give advice on the test; this is something FranceLabs
> should
> > > > >> look
> > > > >> at.
> > > > >> However, nothing of what you are doing will affect the zookeeper
> > > > >> tests in
> > > > >> framework.  That's a totally different issue.
> > > > >>
> > > > >> Karl
> > > > >>
> > > > >>
>

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-23 Thread Karl Wright
This is all you need to do:

ant clean-core-deps
ant make-core-deps
ant clean
ant test
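
The reason for cleaning first, as noted elsewhere in this thread, is that running only `ant make-core-deps` can leave incompatible versions of several libraries side by side. Below is a small sketch for spotting that situation. It is illustrative only: the class `DuplicateJarFinder` is hypothetical, the default directory name `lib` is an assumption, and it relies on the common `artifact-version.jar` naming pattern, so jars named differently are simply ignored.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of ManifoldCF): reports artifacts present in more than one version.
public class DuplicateJarFinder {
    // Splits e.g. "zookeeper-3.8.0.jar" into artifact "zookeeper" and version "3.8.0".
    private static final Pattern JAR_NAME = Pattern.compile("^(.+?)-(\\d[\\w.\\-]*)\\.jar$");

    /** Maps each artifact that appears with two or more distinct versions to those versions. */
    public static Map<String, Set<String>> findConflicts(Collection<String> jarFileNames) {
        Map<String, Set<String>> versionsByArtifact = new TreeMap<>();
        for (String name : jarFileNames) {
            Matcher m = JAR_NAME.matcher(name);
            if (m.matches()) {
                versionsByArtifact.computeIfAbsent(m.group(1), k -> new TreeSet<>()).add(m.group(2));
            }
        }
        // Keep only artifacts that have more than one version present.
        versionsByArtifact.values().removeIf(versions -> versions.size() < 2);
        return versionsByArtifact;
    }

    public static void main(String[] args) throws IOException {
        // "lib" is an assumed default; pass the real dependency directory as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "lib");
        List<String> names;
        try (Stream<Path> files = Files.list(dir)) {
            names = files.map(p -> p.getFileName().toString()).collect(Collectors.toList());
        }
        findConflicts(names).forEach((artifact, versions) ->
            System.out.println(artifact + " has multiple versions: " + versions));
    }
}
```

If this prints anything for the dependency directory, a clean followed by a fresh download is the likely remedy.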

Karl


On Mon, Oct 23, 2023 at 9:55 AM Mingchun Zhao 
wrote:

> Hi Guylaine, Thanks!
>
> > Thanks for all your sharing, it's very helpful! I'll continue...
>
> I'll look into it some more too. If I have any other information I'll share
> it with you.
>
> On Mon, Oct 23, 2023 at 22:49, Guylaine BASSETTE wrote:
>
> > Thanks for all your sharing, it's very helpful! I'll continue...
> >
> > Sorry, French and English mixed up!
> >
> > > On 23/10/2023 at 15:46, Guylaine BASSETTE wrote:
> > > Hi all,
> > >
> > > > Thanks for all your shares, it's very helpful! Merci pour tous vos
> > > partages, c'est très utile ! Je poursuis...
> > >
> > > > On 23/10/2023 at 15:31, Karl Wright wrote:
> > >> I can't give advice on the test; this is something FranceLabs should
> > >> look
> > >> at.
> > >> However, nothing of what you are doing will affect the zookeeper
> > >> tests in
> > >> framework.  That's a totally different issue.
> > >>
> > >> Karl
> > >>
> > >>
> > >> On Mon, Oct 23, 2023 at 9:20 AM Mingchun Zhao<
> mingchun.zha...@gmail.com
> > >
> > >> wrote:
> > >>
> > >>> Hi Karl and Guylaine,
> > >>>
> > >>>> I hope and think it's just a problem specific to the test. Missing
> > >>> updates or incompatible dependencies...
> > >>>
> > >>> Allow me to share with you what I'm working on.  I've tried to
> support
> > >>> http2C within the Solr output connector junit test, but got another
> > >>> unhandled solr exception when I ran `ant run-IT-HSQLDB`.
> > >>>
> > >>> - source code change
> > >>> ```
> > >>> diff --git
> > >>>
> > >>>
> >
> a/connectors/solr/connector/src/test/java/org/apache/manifoldcf/agents/output/solr/tests
> >
> > >>>
> > >>> /MockSolrService.java
> > >>>
> > >>>
> >
> b/connectors/solr/connector/src/test/java/org/apache/manifoldcf/agents/output
> >
> > >>>
> > >>> /solr/tests/MockSolrService.java
> > >>> index 237ade09c..3fb558f52 100644
> > >>> ---
> > >>>
> > >>>
> >
> a/connectors/solr/connector/src/test/java/org/apache/manifoldcf/agents/output/solr/tests/MockSo
> >
> > >>>
> > >>> lrService.java
> > >>> +++
> > >>>
> > >>>
> >
> b/connectors/solr/connector/src/test/java/org/apache/manifoldcf/agents/output/solr/tests/MockSo
> >
> > >>>
> > >>> lrService.java
> > >>> @@ -18,7 +18,10 @@
> > >>>   */
> > >>>   package org.apache.manifoldcf.agents.output.solr.tests;
> > >>>
> > >>> +import org.eclipse.jetty.http2.server.HTTP2CServerConnectionFactory;
> > >>>   import org.eclipse.jetty.servlet.ServletHolder;
> > >>> +import org.eclipse.jetty.server.HttpConfiguration;
> > >>> +import org.eclipse.jetty.server.HttpConnectionFactory;
> > >>>   import org.eclipse.jetty.server.Server;
> > >>>   import org.eclipse.jetty.server.ServerConnector;
> > >>>   import org.eclipse.jetty.servlet.ServletContextHandler;
> > >>> @@ -40,7 +43,10 @@ public class MockSolrService
> > >>> public MockSolrService()
> > >>> {
> > >>>   server = new Server(new QueuedThreadPool(35));
> > >>> -ServerConnector connector = new ServerConnector(server);
> > >>> +HttpConfiguration config = new HttpConfiguration();
> > >>> +HttpConnectionFactory http1 = new HttpConnectionFactory(config);
> > >>> +HTTP2CServerConnectionFactory http2c = new
> > >>> HTTP2CServerConnectionFactory(config);
> > >>> +ServerConnector connector = new ServerConnector(server, http1,
> > >>> http2c);
> > >>>   connector.setPort(8188);
> > >>>   server.addConnector(connector);
> > >>>   servlet = new SolrServlet();
> > >>> @@ -111,6 +117,7 @@ public class MockSolrService
> > >>> res.getWriter().printf(Locale.ROOT, "\n");
> > >>> r

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-23 Thread Karl Wright
WorkerThread.java:1550)
> ~[mcf-pull-agent.jar:?]
> at
>
> org.apache.manifoldcf.crawler.tests.TestingRepositoryConnector.processDocuments(TestingRepositoryConnector.java:84)
> ~[mcf-pull-agent-tests.jar:?]
> at
>
> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:402)
> ~[mcf-pull-agent.jar:?]
> ERROR 2023-10-23T22:10:29,902 (Worker thread '10') - Exception tossed:
> Unhandled Solr exception during indexing http://test72.txt (200): Error
> from server at http://localhost:8188/solr: Expected mime type
> application/octet-stream but got application/xml. 
>   
> 
> ```
>
> Could you give me some advice?
>
> On Mon, Oct 23, 2023 at 22:01, Mingchun Zhao wrote:
>
> > > Then, wherever zookeeper is mentioned in framework/build.xml, a
> > reference to those dependencies must also be included.
> >
> > It looks like zookeeper*.jar was already included in
> > connector-test-classpath within kafka/build.xml.
> > ```
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > ```
> >
> >
> > On Mon, Oct 23, 2023 at 21:50, Karl Wright wrote:
> >
> >> Hi,
> >>
> >> That just downloads zookeeper.  But apparently the zookeeper version
> >> required by Kafka now has dependencies of its own.  Otherwise the
> >> zookeeper
> >> tests wouldn't fail with linkage errors.
> >>
> >> The dependencies need to be identified and added in several places.  The
> >> first place is to the download-zookeeper part of the root build.xml
> >> script.  Then, wherever zookeeper is mentioned in framework/build.xml, a
> >> reference to those dependencies must also be included.
> >>
> >> Karl
> >>
> >>
> >> On Mon, Oct 23, 2023 at 8:32 AM Mingchun Zhao <
> mingchun.zha...@gmail.com>
> >> wrote:
> >>
> >> > Hi Karl,
> >> >
> >> > > Mingchun, did you add the jar(s) that the new zookeeper needs to the
> >> > build.xml download section?
> >> >
> >> > Are the following settings correct? Or do you have an old version of
> >> > zookeeper-*.jar left in your environment?
> >> >
> >> > ```build.xml
> >> > 
> >> > 
> >> > 
> >> > 
> >> > 
> >> >  >> value="${zookeeper.version}"/>
> >> > 
> >> > 
> >> > 
> >> > ... ...
> >> > ```
> >> >
> >> > Regards,
> >> > Mingchun
> >> >
> >> >
> >> > On Mon, Oct 23, 2023 at 21:19, Karl Wright wrote:
> >> >
> >> > > Well, that was interesting.
> >> > >
> >> > > Didn't get very far, because the dependency updates committed broke
> >> > > Zookeeper again:
> >> > >
> >> > > [junit] Testcase:
> >> > >
> >> > >
> >> >
> >>
> multiThreadZooKeeperLockTest(org.apache.manifoldcf.core.lockmanager.TestZooKeeperLocks):
> >> > >  Caused an ERROR
> >> > > [junit] io/netty/handler/ssl/SslContext
> >> > > [junit] java.lang.NoClassDefFoundError:
> >> > io/netty/handler/ssl/SslContext
> >> > > [junit] at
> >> > >
> >> > >
> >> >
> >>
> org.apache.zookeeper.common.ZKConfig.handleBackwardCompatibility(ZKConfig.java:106)
> >> > > [junit] at
> >> > >
> >> > >
> >> >
> >>
> org.apache.zookeeper.client.ZKClientConfig.handleBackwardCompatibility(ZKClientConfig.java:96)
> >> > > [junit] at
> >> > > org.apache.zookeeper.common.ZKConfig.init(ZKConfig.java:92)
> >> > > [junit] at
> >> > > org.apache.zookeeper.common.ZKConfig.<init>(ZKConfig.java:61)
> >> > > [junit] at
> >> > >
> >>
> org.apache.zookeeper.client.ZKClientConfig.<init>(ZKClientConfig.java:69)
> >> > > [junit] at
> >> > > org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:643)
> >> > > [junit] at
> >> > > org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:567)
> >> > > [junit] at
> >> > > org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:734)
> >> > > [junit] a

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-23 Thread Karl Wright
Hi,

That just downloads zookeeper.  But apparently the zookeeper version
required by Kafka now has dependencies of its own.  Otherwise the zookeeper
tests wouldn't fail with linkage errors.

The dependencies need to be identified and added in several places.  The
first place is to the download-zookeeper part of the root build.xml
script.  Then, wherever zookeeper is mentioned in framework/build.xml, a
reference to those dependencies must also be included.

Karl


On Mon, Oct 23, 2023 at 8:32 AM Mingchun Zhao 
wrote:

> Hi Karl,
>
> > Mingchun, did you add the jar(s) that the new zookeeper needs to the
> build.xml download section?
>
> Are the following settings correct? Or do you have an old version of
> zookeeper-*.jar left in your environment?
>
> ```build.xml
> 
> 
> 
> 
> 
> 
> 
> 
> 
> ... ...
> ```
>
> Regards,
> Mingchun
>
>
> On Mon, Oct 23, 2023 at 21:19, Karl Wright wrote:
>
> > Well, that was interesting.
> >
> > Didn't get very far, because the dependency updates committed broke
> > Zookeeper again:
> >
> > [junit] Testcase:
> >
> >
> multiThreadZooKeeperLockTest(org.apache.manifoldcf.core.lockmanager.TestZooKeeperLocks):
> >  Caused an ERROR
> > [junit] io/netty/handler/ssl/SslContext
> > [junit] java.lang.NoClassDefFoundError:
> io/netty/handler/ssl/SslContext
> > [junit] at
> >
> >
> org.apache.zookeeper.common.ZKConfig.handleBackwardCompatibility(ZKConfig.java:106)
> > [junit] at
> >
> >
> org.apache.zookeeper.client.ZKClientConfig.handleBackwardCompatibility(ZKClientConfig.java:96)
> > [junit] at
> > org.apache.zookeeper.common.ZKConfig.init(ZKConfig.java:92)
> > [junit] at
> > org.apache.zookeeper.common.ZKConfig.<init>(ZKConfig.java:61)
> > [junit] at
> > org.apache.zookeeper.client.ZKClientConfig.<init>(ZKClientConfig.java:69)
> > [junit] at
> > org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:643)
> > [junit] at
> > org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:567)
> > [junit] at
> > org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:734)
> > [junit] at
> > org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:448)
> > [junit] at
> >
> >
> org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.createSession(ZooKeeperConnection.java:74)
> > [junit] at
> >
> >
> org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.<init>(ZooKeeperConnection.java:66)
> >
> > It looks like the even newer Zookeeper version has a newer dependency
> that
> > isn't being included in the basic classpath, but should be.  Mingchun,
> did
> > you add the jar(s) that the new zookeeper needs to the build.xml download
> > section?  If so, can you remind me what they were?
> >
> > Karl
> >
> >
> > On Mon, Oct 23, 2023 at 8:11 AM Karl Wright  wrote:
> >
> > > I begin to suspect that the problem may be human error.
> > > If you don't do "ant clean-core-deps; ant make-core-deps", but instead
> > > just use "ant make-core-deps", you could have incompatible versions of
> > > several libraries in your classpath for the tests.  I'll try today to
> > > verify whether that might be happening by trying the tests locally
> > myself.
> > >
> > > Karl
> > >
> > >
> > > On Mon, Oct 23, 2023 at 7:57 AM Guylaine BASSETTE <
> > > guylaine.basse...@francelabs.com> wrote:
> > >
> > >> Hi Karl and Mingchun,
> > >>
> > >> Thanks for your work on the last few issues. I join you on this Solr
> > >> testing problem.
> > >>
> > >> That said, we've tested this new connector in our application, with a
> > >> FileShare job and everything was OK.
> > >>
> > >> I hope and think it's just a problem specific to the test. Missing
> > >> updates or incompatible dependencies...
> > >>
> > >> On 20/10/2023 at 02:58, Mingchun Zhao wrote:
> > >> > Hi Karl, Thanks!
> > >> >
> > >> >> so I wonder if, once again, there's a problem with dependencies for
> > the
> > >> > version of Solr they chose.
> > >> >
> > >> > I'll take a look at this issue.
> > >> >
> > >> > On Fri, Oct 20, 2023 at 9:50, Karl Wright wrote:
> > >> >
> > >> >> This connector FranceLabs updated.  The problem se

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-23 Thread Karl Wright
Well, that was interesting.

Didn't get very far, because the dependency updates committed broke
Zookeeper again:

[junit] Testcase:
multiThreadZooKeeperLockTest(org.apache.manifoldcf.core.lockmanager.TestZooKeeperLocks):
 Caused an ERROR
[junit] io/netty/handler/ssl/SslContext
[junit] java.lang.NoClassDefFoundError: io/netty/handler/ssl/SslContext
[junit] at
org.apache.zookeeper.common.ZKConfig.handleBackwardCompatibility(ZKConfig.java:106)
[junit] at
org.apache.zookeeper.client.ZKClientConfig.handleBackwardCompatibility(ZKClientConfig.java:96)
[junit] at
org.apache.zookeeper.common.ZKConfig.init(ZKConfig.java:92)
[junit] at
org.apache.zookeeper.common.ZKConfig.<init>(ZKConfig.java:61)
[junit] at
org.apache.zookeeper.client.ZKClientConfig.<init>(ZKClientConfig.java:69)
[junit] at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:643)
[junit] at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:567)
[junit] at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:734)
[junit] at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:448)
[junit] at
org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.createSession(ZooKeeperConnection.java:74)
[junit] at
org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.<init>(ZooKeeperConnection.java:66)

It looks like the even newer Zookeeper version has a newer dependency that
isn't being included in the basic classpath, but should be.  Mingchun, did
you add the jar(s) that the new zookeeper needs to the build.xml download
section?  If so, can you remind me what they were?

Karl


On Mon, Oct 23, 2023 at 8:11 AM Karl Wright  wrote:

> I begin to suspect that the problem may be human error.
> If you don't do "ant clean-core-deps; ant make-core-deps", but instead
> just use "ant make-core-deps", you could have incompatible versions of
> several libraries in your classpath for the tests.  I'll try today to
> verify whether that might be happening by trying the tests locally myself.
>
> Karl
>
>
> On Mon, Oct 23, 2023 at 7:57 AM Guylaine BASSETTE <
> guylaine.basse...@francelabs.com> wrote:
>
>> Hi Karl and Mingchun,
>>
>> Thanks for your work on the last few issues. I join you on this Solr
>> testing problem.
>>
>> That said, we've tested this new connector in our application, with a
>> FileShare job and everything was OK.
>>
>> I hope and think it's just a problem specific to the test. Missing
>> updates or incompatible dependencies...
>>
>> Le 20/10/2023 à 02:58, Mingchun Zhao a écrit :
>> > Hi Karl, Thanks!
>> >
>> >> so I wonder if, once again, there's a problem with dependencies for the
>> > version of Solr they chose.
>> >
>> > I'll take a look at this issue.
>> >
>> > 2023年10月20日(金) 9:50 Karl Wright:
>> >
>> >> This connector FranceLabs updated.  The problem seems to occur at a basic
>> >> level during http2 communication, so I wonder if, once again, there's a
>> >> problem with dependencies for the version of Solr they chose.
>> >>
>> >> Karl
>> >>
>> >>
>> >> On Thu, Oct 19, 2023 at 8:32 PM Mingchun Zhao <mingchun.zha...@gmail.com>
>> >> wrote:
>> >>
>> >>> About the test "SolrCrawlHSQLDBIT" failure, it seems that the "IO exception
>> >>> during indexing http://test58.txt: frame_size_error/invalid_frame_length"
>> >>> error is occurring on the ManifoldCF side.
>> >>>
>> >>> - command:
>> >>> ```
>> >>> manifoldcf/connectors/solr% ant run-IT-HSQLDB
>> >>>
>> >>> run-IT-HSQLDB:
>> >>>  [junit] Testsuite:
>> >>> org.apache.manifoldcf.agents.output.solr.tests.SolrCrawlHSQLDBIT
>> >>> ... ...
>> >>> ```
>> >>>
>> >>> - I checked "connectors/solr/test-HSQLDB-output/manifoldcf.log":
>> >>> ```
>> >>>   WARN 2023-10-20T09:14:56,635 (Worker thread '18') - IO exception during
>> >>> indexing http://test58.txt: frame_size_error/invalid_frame_length
>> >>> java.io.IOException: frame_size_error/invalid_frame_length
>> >>> at org.eclipse.jetty.http2.HTTP2Session.toFailure(HTTP2Session.java:566)
>> >>> ~[http2-common-9.4.48.v20220622.jar:9.4.48.v20220622]
>> >>> at org.eclipse.jetty.http2.HTTP2Session.access$2700(HTTP2Session.java:80)
>> >>> ~[http2-common-9.4.48.v20220622.jar:9.4.48.v20220622]
>> >>>

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-23 Thread Karl Wright
I begin to suspect that the problem may be human error.
If you don't do "ant clean-core-deps; ant make-core-deps", but instead just
use "ant make-core-deps", you could have incompatible versions of several
libraries in your classpath for the tests.  I'll try today to verify
whether that might be happening by trying the tests locally myself.
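
A minimal sketch of the sequence I mean, wrapped so the clean step cannot be
skipped by accident. The function name and the ANT override are illustrative
assumptions; only the two ant targets come from the build itself:

```sh
#!/bin/sh
# Sketch only: refresh_core_deps is a hypothetical helper.
# Set ANT="echo ant" for a dry run that just prints the commands.
refresh_core_deps() {
  ant_cmd=${ANT:-ant}
  # Remove previously downloaded jars so stale versions cannot linger
  # on the test classpath.
  $ant_cmd clean-core-deps || return 1
  # Download the versions the current build.xml asks for.
  $ant_cmd make-core-deps || return 1
}
```

Run from the top of the checkout, `refresh_core_deps && ant test` would then
exercise the tests against a freshly downloaded set of libraries.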

Karl


On Mon, Oct 23, 2023 at 7:57 AM Guylaine BASSETTE <
guylaine.basse...@francelabs.com> wrote:

> Hi Karl and Mingchun,
>
> Thanks for your work on the last few issues. I join you on this Solr
> testing problem.
>
> That said, we've tested this new connector in our application, with a
> FileShare job and everything was OK.
>
> I hope and think it's just a problem specific to the test. Missing
> updates or incompatible dependencies...
>
> Le 20/10/2023 à 02:58, Mingchun Zhao a écrit :
> > Hi Karl, Thanks!
> >
> >> so I wonder if, once again, there's a problem with dependencies for the
> > version of Solr they chose.
> >
> > I'll take a look at this issue.
> >
> > 2023年10月20日(金) 9:50 Karl Wright:
> >
> >> This connector FranceLabs updated.  The problem seems to occur at a basic
> >> level during http2 communication, so I wonder if, once again, there's a
> >> problem with dependencies for the version of Solr they chose.
> >>
> >> Karl
> >>
> >>
> >> On Thu, Oct 19, 2023 at 8:32 PM Mingchun Zhao wrote:
> >>
> >>> About the test "SolrCrawlHSQLDBIT" failure, it seems that the "IO exception
> >>> during indexing http://test58.txt: frame_size_error/invalid_frame_length"
> >>> error is occurring on the ManifoldCF side.
> >>>
> >>> - command:
> >>> ```
> >>> manifoldcf/connectors/solr% ant run-IT-HSQLDB
> >>>
> >>> run-IT-HSQLDB:
> >>>  [junit] Testsuite:
> >>> org.apache.manifoldcf.agents.output.solr.tests.SolrCrawlHSQLDBIT
> >>> ... ...
> >>> ```
> >>>
> >>> - I checked "connectors/solr/test-HSQLDB-output/manifoldcf.log":
> >>> ```
> >>>   WARN 2023-10-20T09:14:56,635 (Worker thread '18') - IO exception during
> >>> indexing http://test58.txt: frame_size_error/invalid_frame_length
> >>> java.io.IOException: frame_size_error/invalid_frame_length
> >>> at org.eclipse.jetty.http2.HTTP2Session.toFailure(HTTP2Session.java:566)
> >>> ~[http2-common-9.4.48.v20220622.jar:9.4.48.v20220622]
> >>> at org.eclipse.jetty.http2.HTTP2Session.access$2700(HTTP2Session.java:80)
> >>> ~[http2-common-9.4.48.v20220622.jar:9.4.48.v20220622]
> >>> at org.eclipse.jetty.http2.HTTP2Session$StreamsState.onSessionFailure(HTTP2Session.java:1857)
> >>> ~[http2-common-9.4.48.v20220622.jar:9.4.48.v20220622]
> >>> at org.eclipse.jetty.http2.HTTP2Session$StreamsState.access$400(HTTP2Session.java:1436)
> >>> ~[http2-common-9.4.48.v20220622.jar:9.4.48.v20220622]
> >>> at org.eclipse.jetty.http2.HTTP2Session.onSessionFailure(HTTP2Session.java:511)
> >>> ~[http2-common-9.4.48.v20220622.jar:9.4.48.v20220622]
> >>> at org.eclipse.jetty.http2.HTTP2Session.onConnectionFailure(HTTP2Session.java:506)
> >>> ~[http2-common-9.4.48.v20220622.jar:9.4.48.v20220622]
> >>> at org.eclipse.jetty.http2.parser.Parser$Listener$Wrapper.onConnectionFailure(Parser.java:414)
> >>> ~[http2-common-9.4.48.v20220622.jar:9.4.48.v20220622]
> >>> at org.eclipse.jetty.http2.HTTP2Connection$ParserListener.onConnectionFailure(HTTP2Connection.java:397)
> >>> ~[http2-common-9.4.48.v20220622.jar:9.4.48.v20220622]
> >>> at org.eclipse.jetty.http2.parser.BodyParser.notifyConnectionFailure(BodyParser.java:223)
> >>> ~[http2-common-9.4.48.v20220622.jar:9.4.48.v20220622]
> >>> at org.eclipse.jetty.http2.parser.BodyParser.connectionFailure(BodyParser.java:215)
> >>> ~[http2-common-9.4.48.v20220622.jar:9.4.48.v20220622]
> >>> at
> >> org.eclipse.jetty.http2.parser.Parser.connectionFailure(Parse

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-19 Thread Karl Wright
t] [main] INFO org.eclipse.jetty.server.AbstractConnector -
> > Stopped ServerConnector@2ab5afc7{HTTP/1.1, (http/1.1)}{0.0.0.0:8188}
> >     [junit] [main] INFO org.eclipse.jetty.server.session - node0 Stopped
> > scavenging
> > [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler -
> > Stopped o.e.j.s.ServletContextHandler@7808f638{/solr,null,STOPPED}
> > ```
> >
> > 2023年10月19日(木) 20:05 Mingchun Zhao :
> >
> >> Hi Karl,
> >>
> >> I've updated Kafka and its dependencies, including ZooKeeper, to the
> >> latest versions, and confirmed that the Kafka test run-IT-HSQLDB passes,
> >> as shown below:
> >>
> >> ```
> >> ~manifoldcf% cd connectors/kafka
> >> ~manifoldcf/connectors/kafka/% ant run-IT-HSQLDB
> >>
> >> BUILD SUCCESSFUL
> >> Total time: 1 minute 19 seconds
> >> ```
> >>
> >> Also, I prepared a PR for this issue:
> >> https://github.com/apache/manifoldcf/pull/155
> >>
> >> 2023年10月19日(木) 7:12 Karl Wright :
> >>
> >>> Hi,
> >>>
> >>> It looks like the latest release of Kafka is 3.6.0.
> >>>
> >>> I'd try setting that version in the pom for connectors/kafka and doing
> >>> mvn install.  Then you can see what dependencies it wants by:
> >>> mvn dependency:tree
> >>>
> >>> It may be that Kafka no longer even requires zookeeper - I didn't find it
> >>> in a cursory inspection. But the dependency:tree would be the final word.
> >>>
> >>> Karl
> >>>
> >>>
> >>> On Sat, Oct 14, 2023 at 2:17 AM Mingchun Zhao <mingchun.zha...@gmail.com>
> >>> wrote:
> >>>
> >>> > Karl, thanks!
> >>> > Though I'm not familiar with Kafka, I'll try to find out what's
> >>> > causing the error as best I can.
> >>> >
> >>> > Kind Regards,
> >>> > Mingchun
> >>> >
> >>> >
> >>> > 2023年10月14日(土) 14:07 Karl Wright :
> >>> >
> >>> > > Yes, this seems to be something related to zookeeper update and the
> >>> > > Kafka library version we're using.
> >>> > >
> >>> > > Someone will need to dig into what is going wrong here before we can
> >>> > > release.  I don't know how widely used the kafka connector is but if it
> >>> > > is lightly used we can perhaps not distribute the connector any longer.
> >>> > > But that would be a last choice.
> >>> > >
> >>> > > Karl
> >>> > >
> >>> > >
> >>> > > On Fri, Oct 13, 2023 at 12:12 PM Mingchun Zhao <
> >>> > mingchun.zha...@gmail.com>
> >>> > > wrote:
> >>> > >
> >>> > > > By applying r1912939, I was able to confirm that the kafka test
> >>> compile
> >>> > > > error has disappeared when running `ant test`.
> >>> > > > Thanks, Karl!
> >>> > > >
> >>> > > > However, the following error occurred on subsequent test runs.
> >>> > > > ```
> >>> > > > [junit] Testsuite:
> >>> > > > org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT
> >>> > > > [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0,
> Time
> >>> > > elapsed:
> >>> > > > 0 sec
> >>> > > > [junit]
> >>> > > > [junit] Testcase:
> >>> > > >
> >>> >
> >>>
> org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT:sanityCheck:
> >>> > > >Caused an ERROR
> >>> > > > [junit] Forked Java VM exited abnormally. Please note the
> time
> >>> in
> >>> > the
> >>> > > > report does not reflect the time until the VM exit.
> >>> > > > [junit] junit.framework.AssertionFailedError: Forked Java VM
> >>> exited
> >>> > > > abnormally. Please note the time in the report does not reflect
> the
> >>> > time
> >>> > > > until the VM exit.
> >>> > > > [junit] at

Re: [PR] Fix test org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT failure [manifoldcf]

2023-10-19 Thread Karl Wright
Thanks very much, Mingchun!
I've committed the patch.  Because it also upgrades zookeeper further, I'll
want to run the complete tests before being sure we are done.  I'll try to
do that after my workday is over.  Thanks again!

Karl


On Thu, Oct 19, 2023 at 7:01 AM mingchun-zhao (via GitHub) 
wrote:

>
> mingchun-zhao opened a new pull request, #155:
> URL: https://github.com/apache/manifoldcf/pull/155
>
>Updated Kafka and its dependencies to the latest version including
> zookeeper.
>Confirmed that the Kafka test passed, as below.
>
>```
>~manifoldcf% cd connectors/kafka
>~manifoldcf/connectors/kafka/% ant run-IT-HSQLDB
>
>BUILD SUCCESSFUL
>Total time: 1 minute 19 seconds
>```
>
>
>
> --
> This is an automated message from the Apache Git Service.
> To respond to the message, please log on to GitHub and use the
> URL above to go to the specific comment.
>
> To unsubscribe, e-mail: dev-unsubscr...@manifoldcf.apache.org
>
> For queries about this service, please contact Infrastructure at:
> us...@infra.apache.org
>
>


Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-18 Thread Karl Wright
Hi,

It looks like the latest release of Kafka is 3.6.0.

I'd try setting that version in the pom for connectors/kafka and doing mvn
install.  Then you can see what dependencies it wants by:
mvn dependency:tree

It may be that Kafka no longer even requires zookeeper - I didn't find it
in a cursory inspection. But the dependency:tree would be the final word.
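
A minimal sketch of how one might check the tree output for ZooKeeper rather
than reading it by eye. The function name is hypothetical; the only
assumption about Maven is that ZooKeeper shows up under its usual
org.apache.zookeeper:zookeeper coordinates:

```sh
#!/bin/sh
# Sketch only: reads a dependency:tree listing on stdin so it can be fed
# straight from mvn or from a saved log file.
tree_mentions_zookeeper() {
  if grep -q 'org\.apache\.zookeeper:zookeeper'; then
    echo "zookeeper: present"
  else
    echo "zookeeper: absent"
  fi
}
```

For example, from connectors/kafka: `mvn dependency:tree |
tree_mentions_zookeeper`. The tree goal can also be narrowed with
`-Dincludes=org.apache.zookeeper` if only that branch is of interest.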

Karl


On Sat, Oct 14, 2023 at 2:17 AM Mingchun Zhao 
wrote:

> Karl, thanks!
> Though I'm not familiar with Kafka, I'll try to find out what's causing the
> error as best I can.
>
> Kind Regards,
> Mingchun
>
>
> 2023年10月14日(土) 14:07 Karl Wright :
>
> > Yes, this seems to be something related to zookeeper update and the Kafka
> > library version we're using.
> >
> > Someone will need to dig into what is going wrong here before we can
> > release.  I don't know how widely used the kafka connector is but if it
> > is lightly used we can perhaps not distribute the connector any longer.  But
> >
> > Karl
> >
> >
> > On Fri, Oct 13, 2023 at 12:12 PM Mingchun Zhao <
> mingchun.zha...@gmail.com>
> > wrote:
> >
> > > By applying r1912939, I was able to confirm that the kafka test compile
> > > error has disappeared when running `ant test`.
> > > Thanks, Karl!
> > >
> > > However, the following error occurred on subsequent test runs.
> > > ```
> > > [junit] Testsuite:
> > > org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT
> > > [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time
> > elapsed:
> > > 0 sec
> > > [junit]
> > > [junit] Testcase:
> > >
> org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT:sanityCheck:
> > >Caused an ERROR
> > > [junit] Forked Java VM exited abnormally. Please note the time in
> the
> > > report does not reflect the time until the VM exit.
> > > [junit] junit.framework.AssertionFailedError: Forked Java VM exited
> > > abnormally. Please note the time in the report does not reflect the
> time
> > > until the VM exit.
> > > [junit] at
> > > jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> > > [junit] at
> > >
> > >
> >
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > [junit] at
> > > jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> > > [junit] at
> > >
> > >
> >
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > [junit] at
> > > jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> > > [junit] at
> > >
> > >
> >
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > [junit] at
> > > jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> > > [junit] at
> > >
> > >
> >
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > [junit]
> > > [junit]
> > >
> > > BUILD FAILED
> > > /Users/zhaomingchun/ManifoldCF/manifoldcf/build.xml:517: The following
> > > error occurred while executing this line:
> > > /Users/zhaomingchun/ManifoldCF/manifoldcf/build.xml:471: The following
> > > error occurred while executing this line:
> > >
> /Users/zhaomingchun/ManifoldCF/manifoldcf/dist/connector-build.xml:1102:
> > > Test org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT failed
> > > (crashed)
> > > ```
> > >
> > >
> > > 2023年10月13日(金) 21:56 Karl Wright :
> > >
> > > > r1912939 fixes this but I need to spin a new RC.
> > > > Karl
> > > >
> > > >
> > > > On Fri, Oct 13, 2023 at 8:46 AM Karl Wright 
> > wrote:
> > > >
> > > > > Yes I get the same thing; a test needs to be updated.
> > > > >
> > > > > [javac]
> > > > >
> > > >
> > >
> >
> C:\wip\mcf\release-2.26-branch\connectors\kafka\connector\src\test\java\org\apache\manifoldcf\agents\output\kafka\ZooKeeperLocal.java:45:
> > > > > error: unreported exception AdminServerException; must be caught or
> > > > > declared to be thrown
> > > >

Re: [CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-13 Thread Karl Wright
Yes, this seems to be related to the ZooKeeper update and the Kafka library
version we're using.

Someone will need to dig into what is going wrong here before we can
release.  I don't know how widely used the Kafka connector is, but if it is
lightly used we could perhaps stop distributing it.  But that would be a
last resort.

Karl


On Fri, Oct 13, 2023 at 12:12 PM Mingchun Zhao 
wrote:

> By applying r1912939, I was able to confirm that the kafka test compile
> error has disappeared when running `ant test`.
> Thanks, Karl!
>
> However, the following error occurred on subsequent test runs.
> ```
> [junit] Testsuite:
> org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT
> [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
> 0 sec
> [junit]
> [junit] Testcase:
> org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT:sanityCheck:
>Caused an ERROR
> [junit] Forked Java VM exited abnormally. Please note the time in the
> report does not reflect the time until the VM exit.
> [junit] junit.framework.AssertionFailedError: Forked Java VM exited
> abnormally. Please note the time in the report does not reflect the time
> until the VM exit.
> [junit] at
> jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> [junit] at
>
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit] at
> jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> [junit] at
>
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit] at
> jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> [junit] at
>
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit] at
> jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> [junit] at
>
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit]
> [junit]
>
> BUILD FAILED
> /Users/zhaomingchun/ManifoldCF/manifoldcf/build.xml:517: The following
> error occurred while executing this line:
> /Users/zhaomingchun/ManifoldCF/manifoldcf/build.xml:471: The following
> error occurred while executing this line:
> /Users/zhaomingchun/ManifoldCF/manifoldcf/dist/connector-build.xml:1102:
> Test org.apache.manifoldcf.agents.output.kafka.APISanityHSQLDBIT failed
> (crashed)
> ```
>
>
> 2023年10月13日(金) 21:56 Karl Wright :
>
> > r1912939 fixes this but I need to spin a new RC.
> > Karl
> >
> >
> > On Fri, Oct 13, 2023 at 8:46 AM Karl Wright  wrote:
> >
> > > Yes I get the same thing; a test needs to be updated.
> > >
> > > [javac]
> > >
> >
> C:\wip\mcf\release-2.26-branch\connectors\kafka\connector\src\test\java\org\apache\manifoldcf\agents\output\kafka\ZooKeeperLocal.java:45:
> > > error: unreported exception AdminServerException; must be caught or
> > > declared to be thrown
> > > [javac]   zooKeeperServer.runFromConfig(configuration);
> > > [javac]
> > >
> > > Karl
> > >
> > >
> > > On Fri, Oct 13, 2023 at 8:35 AM Karl Wright 
> wrote:
> > >
> > >> There was a Zookeeper dependency change this release.  I wonder if
> there
> > >> is a test that needs to be updated.  Let me try and see.
> > >>
> > >> Karl
> > >>
> > >>
> > >> On Fri, Oct 13, 2023 at 4:51 AM Piergiorgio Lucidi <
> > >> piergior...@apache.org> wrote:
> > >>
> > >>> Hi Mingchun,
> > >>>
> > >>> thank you for your message and I was trying to build ManifoldCF using
> > >>> OpenJDK 17 so probably in the future for supporting this version of
> > Java
> > >>> we
> > >>> should include Jaxb libraries as well.
> > >>>
> > >>> The build is ok now and I can compile and package everything
> correctly.
> > >>> Unfortunately executing tests I have the following error:
> > >>>
> > >>> compile-tests:
> > >>> [javac] Compiling 1 source file to
> > >>>
> > >>>
> >
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/connectors/kafka/build/connector-tests/classes
> > >>> [javac]
> > >>>
> > >>>
> >
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-

[CANCEL][VOTE] Release ManifoldCF 2.26, RC0

2023-10-13 Thread Karl Wright
r1912939 fixes this but I need to spin a new RC.
Karl


On Fri, Oct 13, 2023 at 8:46 AM Karl Wright wrote:

> Yes I get the same thing; a test needs to be updated.
>
> [javac]
> C:\wip\mcf\release-2.26-branch\connectors\kafka\connector\src\test\java\org\apache\manifoldcf\agents\output\kafka\ZooKeeperLocal.java:45:
> error: unreported exception AdminServerException; must be caught or
> declared to be thrown
> [javac]   zooKeeperServer.runFromConfig(configuration);
> [javac]
>
> Karl
>
>
> On Fri, Oct 13, 2023 at 8:35 AM Karl Wright  wrote:
>
>> There was a Zookeeper dependency change this release.  I wonder if there
>> is a test that needs to be updated.  Let me try and see.
>>
>> Karl
>>
>>
>> On Fri, Oct 13, 2023 at 4:51 AM Piergiorgio Lucidi <
>> piergior...@apache.org> wrote:
>>
>>> Hi Mingchun,
>>>
>>> thank you for your message and I was trying to build ManifoldCF using
>>> OpenJDK 17 so probably in the future for supporting this version of Java
>>> we
>>> should include Jaxb libraries as well.
>>>
>>> The build is ok now and I can compile and package everything correctly.
>>> Unfortunately executing tests I have the following error:
>>>
>>> compile-tests:
>>> [javac] Compiling 1 source file to
>>>
>>> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/connectors/kafka/build/connector-tests/classes
>>> [javac]
>>>
>>> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/connectors/kafka/connector/src/test/java/org/apache/manifoldcf/agents/output/kafka/ZooKeeperLocal.java:45:
>>> error: unreported exception AdminServerException; must be caught or
>>> declared to be thrown
>>> [javac]   zooKeeperServer.runFromConfig(configuration);
>>> [javac]^
>>> [javac] 1 error
>>>
>>> BUILD FAILED
>>> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/build.xml:497:
>>> The following error occurred while executing this line:
>>> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/build.xml:471:
>>> The following error occurred while executing this line:
>>>
>>> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/dist/connector-build.xml:720:
>>> Compile failed; see the compiler error output for details.
>>>
>>> Any ideas?
>>>
>>> Thanks.
>>>
>>> Cheers,
>>> PG
>>>
>>>
>>>
>>> Il giorno gio 12 ott 2023 alle ore 10:39 Mingchun Zhao <
>>> mingchun.zha...@gmail.com> ha scritto:
>>>
>>> > Hi Piergiorgio,
>>> >
>>> > FYI, Allow me to share the java and ant versions and the build steps I
>>> ran
>>> > in my environment.
>>> >
>>> > ```
>>> > $ java --version
>>> > openjdk 11.0.11 2021-04-20
>>> > OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)
>>> > OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed
>>> > mode)
>>> >
>>> > $ ant -version
>>> > Apache Ant(TM) version 1.10.0 compiled on December 27 2016
>>> > ```
>>> > ```
>>> > ant clean
>>> > ant clean-deps
>>> > ant clean-core-deps
>>> > ant make-core-deps
>>> > ant make-deps
>>> > ant build
>>> > ```
>>> >
>>> > Regards,
>>> > Mingchun
>>> >
>>> > 2023年10月12日(木) 17:32 Piergiorgio Lucidi :
>>> >
>>> > > Hi folks,
>>> > >
>>> > > it seems that I can't compile the CswsConnector:
>>> > >
>>> > >[javac] public List
>>> getAttributeGroups()
>>> > > [javac]   ^
>>> > > [javac]   symbol:   class AttributeGroup
>>> > > [javac]   location: class CswsConnector.ObjectInformation
>>> > > [javac]
>>> > >
>>> > >
>>> >
>>> /Volumes/BackupPJ/ManifoldCF-release/apache-manifoldcf-2.26/connectors/csws/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/csws/CswsConnector.java:3966:
>>> > > error: cannot find symbol
>>> > > [javac] public NodePermissions getPermissions()
>>> > > [javac]^
>>> > > [javac]   symbol:   class NodePermissions

Re: [VOTE] Release ManifoldCF 2.26, RC0

2023-10-13 Thread Karl Wright
Yes I get the same thing; a test needs to be updated.

[javac]
C:\wip\mcf\release-2.26-branch\connectors\kafka\connector\src\test\java\org\apache\manifoldcf\agents\output\kafka\ZooKeeperLocal.java:45:
error: unreported exception AdminServerException; must be caught or
declared to be thrown
[javac]   zooKeeperServer.runFromConfig(configuration);
[javac]

Karl


On Fri, Oct 13, 2023 at 8:35 AM Karl Wright wrote:

> There was a Zookeeper dependency change this release.  I wonder if there
> is a test that needs to be updated.  Let me try and see.
>
> Karl
>
>
> On Fri, Oct 13, 2023 at 4:51 AM Piergiorgio Lucidi 
> wrote:
>
>> Hi Mingchun,
>>
>> thank you for your message and I was trying to build ManifoldCF using
>> OpenJDK 17 so probably in the future for supporting this version of Java
>> we
>> should include Jaxb libraries as well.
>>
>> The build is ok now and I can compile and package everything correctly.
>> Unfortunately executing tests I have the following error:
>>
>> compile-tests:
>> [javac] Compiling 1 source file to
>>
>> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/connectors/kafka/build/connector-tests/classes
>> [javac]
>>
>> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/connectors/kafka/connector/src/test/java/org/apache/manifoldcf/agents/output/kafka/ZooKeeperLocal.java:45:
>> error: unreported exception AdminServerException; must be caught or
>> declared to be thrown
>> [javac]   zooKeeperServer.runFromConfig(configuration);
>> [javac]^
>> [javac] 1 error
>>
>> BUILD FAILED
>> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/build.xml:497:
>> The following error occurred while executing this line:
>> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/build.xml:471:
>> The following error occurred while executing this line:
>>
>> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/dist/connector-build.xml:720:
>> Compile failed; see the compiler error output for details.
>>
>> Any ideas?
>>
>> Thanks.
>>
>> Cheers,
>> PG
>>
>>
>>
>> Il giorno gio 12 ott 2023 alle ore 10:39 Mingchun Zhao <
>> mingchun.zha...@gmail.com> ha scritto:
>>
>> > Hi Piergiorgio,
>> >
>> > FYI, Allow me to share the java and ant versions and the build steps I
>> ran
>> > in my environment.
>> >
>> > ```
>> > $ java --version
>> > openjdk 11.0.11 2021-04-20
>> > OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)
>> > OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed
>> > mode)
>> >
>> > $ ant -version
>> > Apache Ant(TM) version 1.10.0 compiled on December 27 2016
>> > ```
>> > ```
>> > ant clean
>> > ant clean-deps
>> > ant clean-core-deps
>> > ant make-core-deps
>> > ant make-deps
>> > ant build
>> > ```
>> >
>> > Regards,
>> > Mingchun
>> >
>> > 2023年10月12日(木) 17:32 Piergiorgio Lucidi :
>> >
>> > > Hi folks,
>> > >
>> > > it seems that I can't compile the CswsConnector:
>> > >
>> > >[javac] public List
>> getAttributeGroups()
>> > > [javac]   ^
>> > > [javac]   symbol:   class AttributeGroup
>> > > [javac]   location: class CswsConnector.ObjectInformation
>> > > [javac]
>> > >
>> > >
>> >
>> /Volumes/BackupPJ/ManifoldCF-release/apache-manifoldcf-2.26/connectors/csws/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/csws/CswsConnector.java:3966:
>> > > error: cannot find symbol
>> > > [javac] public NodePermissions getPermissions()
>> > > [javac]^
>> > > [javac]   symbol:   class NodePermissions
>> > > [javac]   location: class CswsConnector.ObjectInformation
>> > > [javac] 100 errors
>> > > [javac] 1 warning
>> > > [javac] only showing the first 100 errors, of 123 total; use
>> > -Xmaxerrs
>> > > if you would like to see more
>> > >
>> > > BUILD FAILED
>> > >
>> >
>> /Volumes/BackupPJ/ManifoldCF-release/apache-manifoldcf-2.26/build.xml:489:
>> > > The following error occurred while executing this line:
>> > >
>> >

Re: [VOTE] Release ManifoldCF 2.26, RC0

2023-10-13 Thread Karl Wright
There was a Zookeeper dependency change this release.  I wonder if there is
a test that needs to be updated.  Let me try and see.

Karl


On Fri, Oct 13, 2023 at 4:51 AM Piergiorgio Lucidi 
wrote:

> Hi Mingchun,
>
> Thank you for your message. I was trying to build ManifoldCF using
> OpenJDK 17, so to support that version of Java in the future we should
> probably include the JAXB libraries as well.
>
> The build is OK now and I can compile and package everything correctly.
> Unfortunately, when running the tests I get the following error:
>
> compile-tests:
> [javac] Compiling 1 source file to
>
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/connectors/kafka/build/connector-tests/classes
> [javac]
>
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/connectors/kafka/connector/src/test/java/org/apache/manifoldcf/agents/output/kafka/ZooKeeperLocal.java:45:
> error: unreported exception AdminServerException; must be caught or
> declared to be thrown
> [javac]   zooKeeperServer.runFromConfig(configuration);
> [javac]^
> [javac] 1 error
>
> BUILD FAILED
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/build.xml:497:
> The following error occurred while executing this line:
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/build.xml:471:
> The following error occurred while executing this line:
>
> /Users/piergiorgiolucidi/Downloads/apache-manifoldcf-2.26/dist/connector-build.xml:720:
> Compile failed; see the compiler error output for details.
>
> Any ideas?
>
> Thanks.
>
> Cheers,
> PG
>
>
>
> Il giorno gio 12 ott 2023 alle ore 10:39 Mingchun Zhao <
> mingchun.zha...@gmail.com> ha scritto:
>
> > Hi Piergiorgio,
> >
> > FYI, allow me to share the Java and Ant versions and the build steps I ran
> > in my environment.
> >
> > ```
> > $ java --version
> > openjdk 11.0.11 2021-04-20
> > OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)
> > OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed
> > mode)
> >
> > $ ant -version
> > Apache Ant(TM) version 1.10.0 compiled on December 27 2016
> > ```
> > ```
> > ant clean
> > ant clean-deps
> > ant clean-core-deps
> > ant make-core-deps
> > ant make-deps
> > ant build
> > ```
> >
> > Regards,
> > Mingchun
> >
> > 2023年10月12日(木) 17:32 Piergiorgio Lucidi :
> >
> > > Hi folks,
> > >
> > > it seems that I can't compile the CswsConnector:
> > >
> > >[javac] public List
> getAttributeGroups()
> > > [javac]   ^
> > > [javac]   symbol:   class AttributeGroup
> > > [javac]   location: class CswsConnector.ObjectInformation
> > > [javac]
> > >
> > >
> >
> /Volumes/BackupPJ/ManifoldCF-release/apache-manifoldcf-2.26/connectors/csws/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/csws/CswsConnector.java:3966:
> > > error: cannot find symbol
> > > [javac] public NodePermissions getPermissions()
> > > [javac]^
> > > [javac]   symbol:   class NodePermissions
> > > [javac]   location: class CswsConnector.ObjectInformation
> > > [javac] 100 errors
> > > [javac] 1 warning
> > > [javac] only showing the first 100 errors, of 123 total; use
> > -Xmaxerrs
> > > if you would like to see more
> > >
> > > BUILD FAILED
> > >
> >
> /Volumes/BackupPJ/ManifoldCF-release/apache-manifoldcf-2.26/build.xml:489:
> > > The following error occurred while executing this line:
> > >
> >
> /Volumes/BackupPJ/ManifoldCF-release/apache-manifoldcf-2.26/build.xml:471:
> > > The following error occurred while executing this line:
> > >
> > >
> >
> /Volumes/BackupPJ/ManifoldCF-release/apache-manifoldcf-2.26/dist/connector-build.xml:686:
> > > Compile failed; see the compiler error output for details.
> > >
> > > Do you have any ideas?
> > >
> > > Thanks,
> > > PG
> > >
> > > Il giorno gio 12 ott 2023 alle ore 09:03 Guylaine BASSETTE <
> > > guylaine.basse...@francelabs.com> ha scritto:
> > >
> > > > Hi,
> > > >
> > > > +1 from France Labs
> > > >
> > > > Regards,
> > > > Guylaine
> > > >
> > > > France Labs – Your knowledge, now
> > > > Datafari Ent

[VOTE] Release ManifoldCF 2.26, RC0

2023-10-11 Thread Karl Wright
Please vote on whether to release Apache ManifoldCF 2.26, RC0.
This release is the first release that requires at least Java 11, and it
also includes a new CSV connector along with support for Solr 9.  The
release artifact can be found at:
https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.26 ,
and there is a release tag also at
https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.26-RC0 .
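
Since Java 11 is now the minimum, a minimal sketch of a pre-build check that
voters could run. Both helper names are hypothetical, and the parsing assumes
the version string is the first number on the first line of the java output:

```sh
#!/bin/sh
# Sketch only: extract the major Java version from the first line of
# `java --version` (or `java -version 2>&1`) output read on stdin.
java_major() {
  # Handles both "11.0.11" and legacy "1.8.0_292" style version strings.
  sed -n '1s/[^0-9]*\([0-9][0-9]*\)\.\{0,1\}\([0-9]*\).*/\1 \2/p' |
    while read -r first second; do
      if [ "$first" = "1" ] && [ -n "$second" ]; then
        echo "$second"   # legacy scheme: 1.8 means Java 8
      else
        echo "$first"
      fi
    done
}

# Succeeds only when the version read on stdin is Java 11 or newer.
java_ok_for_2_26() {
  major=$(java_major)
  [ -n "$major" ] && [ "$major" -ge 11 ]
}
```

For example: `java -version 2>&1 | java_ok_for_2_26 || echo "Java 11+
required"`.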

Karl


Re: Ready for the 2.26 release?

2023-10-09 Thread Karl Wright
It's worse than that.
It looks like the native2ascii invocations have been stripped entirely from
the build.xml files everywhere, not just in ui-core.  I'm trying to figure
out when that happened and see if I can get it back.
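
A minimal sketch of the kind of check I mean, for anyone who wants to confirm
it on their own checkout. The function name is hypothetical; it simply lists
the build.xml files under a directory that still mention native2ascii:

```sh
#!/bin/sh
# Sketch only: print each build.xml under the given root (default: current
# directory) that still contains a native2ascii invocation.
build_files_with_native2ascii() {
  root=${1:-.}
  find "$root" -name build.xml -type f -exec grep -l 'native2ascii' {} + | sort
}
```

An empty result from `build_files_with_native2ascii .` at the top of the
tree would match what I am seeing; comparing against an older release tag
should show which files used to have the task.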

Karl


On Mon, Oct 9, 2023 at 12:42 PM Olivier Tavard <
olivier.tav...@francelabs.com> wrote:

> Hi Karl,
>
> My colleague Guylaine noticed that one line was missing from
> framework/build.xml in the latest version:
>  dest="build/ui-core/classes" includes="**/*.properties" />
>
> So all the i18n translation files were missing into the build. The correct
> code is :
>
> 
> 
>  destdir="build/ui-core/classes" deprecation="true" target="1.8"
> source="1.8" debug="true" encoding="utf-8" debuglevel="lines,vars,source">
> 
> 
> 
> 
> 
>   *   src="ui-core/src/main/native2ascii" dest="build/ui-core/classes"
> includes="**/*.properties" />*
> 
>
> With this line added, the build is OK again. We can open a PR for it
> tomorrow if needed.
>
> Thanks,
>
>
> Olivier TAVARD
> Managing Director - Co-founder
> France Labs – Makers of Datafari Enterprise Search
> Datafari Enterprise Search <https://www.datafari.com>
>
> On Oct 9, 2023, at 18:32, Karl Wright wrote:
>
> I looked very briefly at this and discovered that every message is
> affected.
> It may be due, therefore, to jetty refusing access to the translation
> resources.  But if that's the case I'm not going to be able to do anything
> to get this release out this month; I'm booked solid in fact until January.
>
> So good luck, folks.  I'd try rolling the Jetty version update back if you
> can as a first step.
>
> Karl
>
>
> On Mon, Oct 9, 2023 at 9:08 AM Karl Wright  wrote:
>
> No change to paths has been made.
> Probably the translation files have been corrupted due to many merges and
> perhaps bad encodings for some of them.  It will need to be looked into.
> Karl
>
>
> On Mon, Oct 9, 2023 at 8:40 AM Guylaine BASSETTE <
> guylaine.basse...@francelabs.com> wrote:
>
> Hello all,
>
> We have tested this version and everything is OK except for the
> translations, where something is broken:
>
> It looks like the translation files are no longer found. Maybe the
> path to those files has changed?
> Best regards,
> Guylaine
>
> France Labs – Your knowledge, now
> Datafari Enterprise Search – Découvrez la version 5 / Discover our
> version 5
> www.datafari.com
>
> Meet us at Milipol <https://www.milipol.com/> from November 14 to 17
>
> On 06/10/2023 at 09:31, Karl Wright wrote:
>
> Hi all,
>
> The tentative release schedule had a release going out on Sept 30th, which
> is now overdue.  Partly this was because of me, but also partly it's the
> result of new contributions from France Labs.  But these contributions are
> now committed to trunk and we could go ahead - unless others are expected
> to be coming shortly, in which case we should wait.  Please let me know.
>
> In any case, if I don't hear back by this weekend I will try to create a
> release candidate then.
>
> Karl
>
>
>


Re: Ready for the 2.26 release?

2023-10-09 Thread Karl Wright
I looked very briefly at this and discovered that every message is affected.
It may be due, therefore, to jetty refusing access to the translation
resources.  But if that's the case I'm not going to be able to do anything
to get this release out this month; I'm booked solid in fact until January.

So good luck, folks.  I'd try rolling the Jetty version update back if you
can as a first step.

Karl


On Mon, Oct 9, 2023 at 9:08 AM Karl Wright  wrote:

> No change to paths has been made.
> Probably the translation files have been corrupted due to many merges and
> perhaps bad encodings for some of them.  It will need to be looked into.
> Karl
>
>
> On Mon, Oct 9, 2023 at 8:40 AM Guylaine BASSETTE <
> guylaine.basse...@francelabs.com> wrote:
>
>> Hello all,
>>
>> We have tested this version and everything is OK except for the
>> translations, where something is broken:
>>
>> It looks like the translation files are no longer found. Maybe the
>> path to those files has changed?
>> Best regards,
>> Guylaine
>>
>> France Labs – Your knowledge, now
>> Datafari Enterprise Search – Découvrez la version 5 / Discover our
>> version 5
>> www.datafari.com
>>
>> Meet us at Milipol <https://www.milipol.com/> from November 14 to 17
>>
>> On 06/10/2023 at 09:31, Karl Wright wrote:
>>
>> Hi all,
>>
>> The tentative release schedule had a release going out on Sept 30th, which
>> is now overdue.  Partly this was because of me, but also partly it's the
>> result of new contributions from France Labs.  But these contributions are
>> now committed to trunk and we could go ahead - unless others are expected
>> to be coming shortly, in which case we should wait.  Please let me know.
>>
>> In any case, if I don't hear back by this weekend I will try to create a
>> release candidate then.
>>
>> Karl
>>
>>
>>


Ready for the 2.26 release?

2023-10-06 Thread Karl Wright
Hi all,

The tentative release schedule had a release going out on Sept 30th, which
is now overdue.  Partly this was because of me, but also partly it's the
result of new contributions from France Labs.  But these contributions are
now committed to trunk and we could go ahead - unless others are expected
to be coming shortly, in which case we should wait.  Please let me know.

In any case, if I don't hear back by this weekend I will try to create a
release candidate then.

Karl


Re: Contribution : new Connector CSV

2023-10-06 Thread Karl Wright
The connector was committed to trunk just now.
Please verify that it is working properly.

For the documentation, we really need a pull request to update the
ManifoldCF documentation before we could include it.

Karl


On Tue, Oct 3, 2023 at 5:18 AM Furkan KAMACI  wrote:

> Thanks for your effort Guylaine!
>
> On 3 Oct 2023 Tue at 12:11 Guylaine BASSETTE <
> guylaine.basse...@francelabs.com> wrote:
>
> > Hello all,
> >
> > In order to remove the dependency on the JDBC CSV driver (it does
> > not seem to be maintained anymore), we have refactored the CSV connector
> > to make it work independently of it. May we propose the
> > corresponding patch? The documentation is here:
> >
> >
> https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/2894036993/CSV+Connector
> >
> > I have prepared a PR if you are interested:
> > https://github.com/apache/manifoldcf/pull/152
> >
> > --
> >
> > Best regards,
> > Guylaine
> >
> > France Labs – Your knowledge, now
> > Datafari Enterprise Search – Découvrez la version 5 / Discover our
> version
> > 5
> > www.datafari.com 
> >
> >
>


Re: (CONNECTORS-1740) Solr 9 output connector

2023-09-27 Thread Karl Wright
Glad you were able to figure it out.
I will have some time perhaps this weekend to merge a few pull requests and
maybe yours can be done as well.


On Wed, Sep 27, 2023 at 11:52 AM Guylaine BASSETTE <
guylaine.basse...@francelabs.com> wrote:

> Hello,
>
> I have good news :-)
>
> I've been working on the ManifoldCF trunk with Zookeeper 3.8.0, and the
> error in the TestZooKeeperLocks test class has changed. Both tests
> failed because the ZK server wouldn't start.
>
> First of all, I found two missing dependencies that fix the tests, but
> only when they are run one by one. When run together in the test suite,
> whichever test ran first caused the second one to fail.
>
> It seems that the way of stopping the server in the context of the test
> is not effective. It's as if it prevents the second test from starting
> the server.
>
> There may be a more appropriate way of stopping the new Zookeeper, but
> with the time I had available, I thought it would be just as well to
> start/stop the server once for all the tests. So I suggested modifying
> the ZooKeeperBase class in that way.
>
> And now all the core-framework tests are OK !
>
>
> In a second step, I have included Mingchun Zhao's patch to migrate to
> Java 11.
>
> I have created a Pull Request :
> https://github.com/apache/manifoldcf/pull/150.
>
>
> Best regards,
> Guylaine
>
> France Labs – Your knowledge, now
> Datafari Enterprise Search – Découvrez la version 5 / Discover our version
> 5
> www.datafari.com <http://www.datafari.com>
>
>
> On 05/09/2023 at 14:16, Karl Wright wrote:
> > I don't have any special influence in the Zookeeper project I fear.
> >
> > On Tue, Sep 5, 2023 at 3:46 AM Guylaine BASSETTE <
> > guylaine.basse...@francelabs.com> wrote:
> >
> >> Hello Karl,
> >>
> >> Since we sent the ticket to Zookeeper, we've had no response from them.
> >> Can we ask you for a little help to move the subject forward?
> >>
> >> Best regards,
> >> Guylaine
> >>
> >> France Labs – Your knowledge, now
> >> Datafari Enterprise Search – Découvrez la version 5 / Discover our
> version
> >> 5
> >> www.datafari.com  <http://www.datafari.com>
> >> On 18/07/2023 at 17:19, Guylaine BASSETTE wrote:
> >>> Hello Karl,
> >>>
> >>> Thanks for your answer.
> >>>
> >>> We have created a Jira Bug to ZooKeeper:
> >>> https://issues.apache.org/jira/browse/ZOOKEEPER-4722.
> >>>
> >>> They might contact you for help.
> >>>
> >>>
> >>> Best regards,
> >>> Guylaine
> >>>
> >>> France Labs – Your knowledge, now
> >>> Datafari Enterprise Search – Découvrez la version 5 / Discover our
> >>> version 5
> >>> www.datafari.com  <http://www.datafari.com>
> >>>
> >>>
> >>> On 02/06/2023 at 01:04, Karl Wright wrote:
> >>>> Okay, it's as I suspected, the Zookeeper update didn't change any
> >>>> functionality but just broke stuff.
> >>>>
> >>>> The first thing I'd do is alert the Solr team to the problem.  They
> >> should
> >>>> for now roll back their dependency so that an earlier Zookeeper is
> used.
> >>>> The next step would be to work with the Zookeeper team to use
> ManifoldCF
> >>>> unit tests to allow them to fix the problem, as you say. Rather than
> >>>> assuming this is the same problem we see in previous Zookeeper tickets
> >> (it
> >>>> probably is but we can't be sure of that), I'd create a new one
> >> describing
> >>>> very carefully how to reproduce this using a ManifoldCF branch
> checkout.
> >>>> Be prepared to interact with the Zookeeper team at some length about
> the
> >>>> problem and how to reproduce it.
> >>>>
> >>>> My sense is that Zookeeper's original authors are long gone and you
> may
> >> not
> >>>> get very far here.  And I have very limited time availability these
> >> days.
> >>>> If you are blocked in this in some way let me know and I will do my
> >> best to
> >>>> jump in and unblock you.
> >>>>
> >>>> I'd also fix the Solr 9 branch (after you make a copy of it for the
> >>>> Zookeeper folks) so that a working version of Zookeeper is downloaded
> >> and
> >>>> we can then merge that branch.  Please let me know when that is

Re: Documents Out Of Scope and hop count

2023-09-26 Thread Karl Wright
No, only the seed URLs get updated with that option.


On Tue, Sep 26, 2023 at 10:09 AM Marisol Redondo <
marisol.redondo.gar...@gmail.com> wrote:

> Thanks a lot for the explanation, Karl, really useful.
>
> I will wait for your reply at the end of the week, but I thought that the
> main purpose of the "Reset seeding" option was exactly that: re-evaluating
> all pages, as in a fresh execution.
>
>
> On Tue, 26 Sept 2023 at 13:30, Karl Wright  wrote:
>
>> Okay, that is good to know.
>> The hopcount assessment occurs when documents are added to the queue.
>> Hopcounts are stored for each document in the hopcount table.  So if you
>> change a hopcount limit, it is quite possible that nothing will change
>> unless documents that are at the previous hopcount limit are re-evaluated.
>> I believe there is no logic in ManifoldCF for that at this time, but I'd
>> have to review the codebase to be certain of that.
>>
>> What that means is that you can't increase the hopcount limit and expect
>> the next crawl to pick up the documents you excluded before with the
>> hopcount mechanism.  Only when the documents need to be rescanned for some
>> other reason would that happen as it stands now.  But I will get back to
>> you after a review at the end of the week.
>>
>> Karl
>>
>>
>> On Tue, Sep 26, 2023 at 8:04 AM Marisol Redondo <
>> marisol.redondo.gar...@gmail.com> wrote:
>>
>>> No, I haven't used this option. I have it configured as "Keep
>>> unreachable documents, for now", but is it also ignoring them because they
>>> were already kept? With this option, when are the documents that are
>>> unreachable "for now" converted to "forever"?
>>>
>>> The only solution I can think of is creating a new job with the exact
>>> same characteristics and running it.
>>>
>>> Regards and thanks
>>>Marisol
>>>
>>>
>>>
>>> On Tue, 26 Sept 2023 at 12:35, Karl Wright  wrote:
>>>
>>>> If you ever set "Ignore unreachable documents forever" for the job, you
>>>> can't go back and stop ignoring them.  The data that the job would need to
>>>> have recorded for this is gone.  The only way to get it back is if you can
>>>> convince ManifoldCF to recrawl all documents in the job.
>>>>
>>>>
>>>> On Tue, Sep 26, 2023 at 4:51 AM Marisol Redondo <
>>>> marisol.redondo.gar...@gmail.com> wrote:
>>>>
>>>>>
>>>>> Hi, I have a problem with documents being out of scope.
>>>>>
>>>>> I changed the maximum hop count for type "redirect" in one of my jobs to
>>>>> 5 and saw that the job was not processing some pages because of that, so I
>>>>> removed the value to get them injected into the output connector (Solr
>>>>> connector).
>>>>> After that, the same pages are still out of scope, as if the limit had
>>>>> been set to 1, and they are not indexed.
>>>>>
>>>>> I have tried "Reset seeding", thinking that maybe the pages need to
>>>>> be checked again, but I still have the same problem. I don't think the
>>>>> problem is with the output, but I have also used the options "Re-index all
>>>>> associated documents" and "Remove all associated records", with the same
>>>>> result.
>>>>> I don't want to clear the history in the repository, which is a
>>>>> website connector, as I don't want to lose all the history.
>>>>>
>>>>> Is this a bug in Manifold? Is there any option to fix this issue?
>>>>>
>>>>> I'm using Manifold version 2.24.
>>>>>
>>>>> Thanks
>>>>> Marisol
>>>>>
>>>>>


Re: Documents Out Of Scope and hop count

2023-09-26 Thread Karl Wright
Okay, that is good to know.
The hopcount assessment occurs when documents are added to the queue.
Hopcounts are stored for each document in the hopcount table.  So if you
change a hopcount limit, it is quite possible that nothing will change
unless documents that are at the previous hopcount limit are re-evaluated.
I believe there is no logic in ManifoldCF for that at this time, but I'd
have to review the codebase to be certain of that.

What that means is that you can't increase the hopcount limit and expect
the next crawl to pick up the documents you excluded before with the
hopcount mechanism.  Only when the documents need to be rescanned for some
other reason would that happen as it stands now.  But I will get back to
you after a review at the end of the week.
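
To make the point above concrete, here is a toy model of a crawl queue that checks the hop count only at enqueue time. It is a sketch under that single assumption and does not reflect ManifoldCF's actual classes or its hopcount table; every name in it is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class HopcountSketch {
    private final Map<String, Integer> hopcounts = new HashMap<>();
    private final Queue<String> queue = new ArrayDeque<>();
    private int maxHops;

    public HopcountSketch(int maxHops) {
        this.maxHops = maxHops;
    }

    /** Changing the limit does not revisit documents that were already assessed. */
    public void setMaxHops(int maxHops) {
        this.maxHops = maxHops;
    }

    /** Records the hop count and queues the document only if it is within the limit right now. */
    public boolean offer(String url, int hops) {
        hopcounts.put(url, hops);
        if (hops > maxHops) {
            return false; // out of scope at the time of assessment
        }
        queue.add(url);
        return true;
    }

    public boolean isQueued(String url) {
        return queue.contains(url);
    }

    public static void main(String[] args) {
        HopcountSketch crawl = new HopcountSketch(1);
        crawl.offer("http://example.com/page", 2); // rejected under the old limit
        crawl.setMaxHops(5);                       // raising the limit alone changes nothing
        System.out.println(crawl.isQueued("http://example.com/page"));
    }
}
```

In this model the page only enters the queue if something offers it again after the limit is raised, which corresponds to the document being rescanned for some other reason.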

Karl


On Tue, Sep 26, 2023 at 8:04 AM Marisol Redondo <
marisol.redondo.gar...@gmail.com> wrote:

> No, I haven't used this option. I have it configured as "Keep unreachable
> documents, for now", but is it also ignoring them because they were already
> kept? With this option, when are the documents that are unreachable
> "for now" converted to "forever"?
>
> The only solution I can think of is creating a new job with the exact same
> characteristics and running it.
>
> Regards and thanks
>Marisol
>
>
>
> On Tue, 26 Sept 2023 at 12:35, Karl Wright  wrote:
>
>> If you ever set "Ignore unreachable documents forever" for the job, you
>> can't go back and stop ignoring them.  The data that the job would need to
>> have recorded for this is gone.  The only way to get it back is if you can
>> convince ManifoldCF to recrawl all documents in the job.
>>
>>
>> On Tue, Sep 26, 2023 at 4:51 AM Marisol Redondo <
>> marisol.redondo.gar...@gmail.com> wrote:
>>
>>>
>>> Hi, I have a problem with documents being out of scope.
>>>
>>> I changed the maximum hop count for type "redirect" in one of my jobs to
>>> 5 and saw that the job was not processing some pages because of that, so I
>>> removed the value to get them injected into the output connector (Solr
>>> connector).
>>> After that, the same pages are still out of scope, as if the limit had
>>> been set to 1, and they are not indexed.
>>>
>>> I have tried "Reset seeding", thinking that maybe the pages need to be
>>> checked again, but I still have the same problem. I don't think the problem
>>> is with the output, but I have also used the options "Re-index all associated
>>> documents" and "Remove all associated records", with the same result.
>>> I don't want to clear the history in the repository, which is a website
>>> connector, as I don't want to lose all the history.
>>>
>>> Is this a bug in Manifold? Is there any option to fix this issue?
>>>
>>> I'm using Manifold version 2.24.
>>>
>>> Thanks
>>> Marisol
>>>
>>>


Re: Documents Out Of Scope and hop count

2023-09-26 Thread Karl Wright
If you ever set "Ignore unreachable documents forever" for the job, you
can't go back and stop ignoring them.  The data that the job would need to
have recorded for this is gone.  The only way to get it back is if you can
convince ManifoldCF to recrawl all documents in the job.


On Tue, Sep 26, 2023 at 4:51 AM Marisol Redondo <
marisol.redondo.gar...@gmail.com> wrote:

>
> Hi, I have a problem with documents being out of scope.
>
> I changed the maximum hop count for type "redirect" in one of my jobs to 5
> and saw that the job was not processing some pages because of that, so I
> removed the value to get them injected into the output connector (Solr
> connector).
> After that, the same pages are still out of scope, as if the limit had been
> set to 1, and they are not indexed.
>
> I have tried "Reset seeding", thinking that maybe the pages need to be
> checked again, but I still have the same problem. I don't think the problem
> is with the output, but I have also used the options "Re-index all associated
> documents" and "Remove all associated records", with the same result.
> I don't want to clear the history in the repository, which is a website
> connector, as I don't want to lose all the history.
>
> Is this a bug in Manifold? Is there any option to fix this issue?
>
> I'm using Manifold version 2.24.
>
> Thanks
> Marisol
>
>


Re: RE : Contribution to ManifoldCF webcrawler

2023-09-26 Thread Karl Wright
Looks good!
I will try to get this merged today.
Karl


On Mon, Sep 25, 2023 at 8:14 AM Karl Wright  wrote:

> Thanks.
> I will have a look at first opportunity.
> Karl
>
>
> On Mon, Sep 25, 2023 at 7:00 AM Emeric Bernet-Rollande <
> emeric.ber...@francelabs.com> wrote:
>
>> Hi,
>>
>> I opened a Pull Request, right here !
>> https://github.com/apache/manifoldcf/pull/149
>>
>> Regards,
>>
>> Emeric Bernet-Rollande
>>
>> France Labs – Your knowledge, now
>> Datafari Enterprise Search – Découvrez la version 5 / Discover our
>> version 5
>> www.datafari.com
>>
>> From: Furkan KAMACI
>> Sent: Monday, September 25, 2023 09:28
>> To: dev@manifoldcf.apache.org
>> Cc: olivier.tav...@francelabs.com; France Labs
>> Subject: Re: Contribution to ManifoldCF webcrawler
>>
>> Hi Emeric,
>>
>> First of all, thank you for your effort and suggestion. Do you have a Pull
>> Request for that improvement?
>>
>> Kind regards,
>> Furkan Kamaci
>>
>> On Mon, Sep 25, 2023 at 10:23 AM Emeric Bernet-Rollande <
>> emeric.ber...@francelabs.com> wrote:
>>
>> > Hi Karl and all !
>> >
>> >
>> >
>> > I’ve been working on the MCF webcrawler component for our Datafari
>> > project, and I made some developments that might interest the MCF
>> community.
>> >
>> >
>> >
>> > Currently, if a website redirects the user with a 301 or 302 code and
>> > « limit to seed » is checked, the website that the redirection points
>> > to won't be indexed. We added an option, « Force the inclusion of
>> > redirections », which overrides the previous checkbox if the crawl
>> > encounters a redirection.
>> >
>> >
>> >
>> >
>> >
>> > Would you be interested in getting the patch to integrate it into
>> > ManifoldCF? The corresponding documentation can be found here:
>> >
>> https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/1886879745/Web+Connectors
>> >
>> >
>> >
>> > Regards,
>> >
>> >
>> >
>> > Emeric Bernet-Rollande
>> >
>> >
>> >
>> > *France Labs – Your knowledge, now*
>> >
>> > Datafari Enterprise Search – Découvrez la version 5 / Discover our
>> version
>> > 5
>> > www.datafari.com
>> >
>> >
>> >
>>
>>


Re: web crawler https

2023-09-25 Thread Karl Wright
See this article:

https://stackoverflow.com/questions/6784463/error-trustanchors-parameter-must-be-non-empty

The ManifoldCF web crawler configuration allows you to drop certs into a
local trust store for the connection.  You need to either do that (adding
whatever certificate authority cert you think might be missing) or check
the "trust https" checkbox.

You can generally debug what certs a site might need by trying to fetch a
page with curl and using verbose debug mode.
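
As background on the quoted error: the JDK raises InvalidAlgorithmParameterException when certificate path validation is set up from a trust store that contains no trusted certificate entries. The sketch below reproduces that condition with the standard java.security API only; it is an illustration, not the crawler's code, and the class name is made up.

```java
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.InvalidAlgorithmParameterException;
import java.security.KeyStore;
import java.security.cert.PKIXParameters;

public class EmptyTrustStoreDemo {
    /** Returns true if the JDK refuses to build validation parameters from an empty trust store. */
    public static boolean rejectsEmptyTrustStore() {
        try {
            KeyStore empty = KeyStore.getInstance(KeyStore.getDefaultType());
            empty.load(null, null); // initialise an in-memory keystore with no entries
            new PKIXParameters(empty); // needs at least one trusted certificate entry
            return false;
        } catch (InvalidAlgorithmParameterException e) {
            return true; // same exception type as in the log quoted in this thread
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsEmptyTrustStore());
    }
}
```

Adding the relevant certificate authority cert to the connection's trust store gives the validator at least one trust anchor, which is why either remedy above makes the error go away.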

Karl


On Mon, Sep 25, 2023 at 10:48 AM Bisonti Mario 
wrote:

> Hi,
>
> I would like to try indexing an internal WordPress site.
>
> I configured a Web repository connection and a job with seeds, but I
> always get:
>
>
>
> WARN 2023-09-25T16:31:50,905 (Worker thread '4') - Service interruption
> reported for job 1695649924581 connection 'Wp': IO exception
> (javax.net.ssl.SSLException)reading header: Unexpected error:
> java.security.InvalidAlgorithmParameterException: the trustAnchors
> parameter must be non-empty
>
>
>
> How could I solve this?
>
> Thanks a lot
>
> Mario
>
>


Re: RE : Contribution to ManifoldCF webcrawler

2023-09-25 Thread Karl Wright
Thanks.
I will have a look at first opportunity.
Karl


On Mon, Sep 25, 2023 at 7:00 AM Emeric Bernet-Rollande <
emeric.ber...@francelabs.com> wrote:

> Hi,
>
> I opened a Pull Request, right here !
> https://github.com/apache/manifoldcf/pull/149
>
> Regards,
>
> Emeric Bernet-Rollande
>
> France Labs – Your knowledge, now
> Datafari Enterprise Search – Découvrez la version 5 / Discover our version
> 5
> www.datafari.com
>
> From: Furkan KAMACI
> Sent: Monday, September 25, 2023 09:28
> To: dev@manifoldcf.apache.org
> Cc: olivier.tav...@francelabs.com; France Labs
> Subject: Re: Contribution to ManifoldCF webcrawler
>
> Hi Emeric,
>
> First of all, thank you for your effort and suggestion. Do you have a Pull
> Request for that improvement?
>
> Kind regards,
> Furkan Kamaci
>
> On Mon, Sep 25, 2023 at 10:23 AM Emeric Bernet-Rollande <
> emeric.ber...@francelabs.com> wrote:
>
> > Hi Karl and all !
> >
> >
> >
> > I’ve been working on the MCF webcrawler component for our Datafari
> > project, and I made some developments that might interest the MCF
> community.
> >
> >
> >
> > Currently, if a website redirects the user with a 301 or 302 code and
> > « limit to seed » is checked, the website that the redirection points
> > to won't be indexed. We added an option, « Force the inclusion of
> > redirections », which overrides the previous checkbox if the crawl
> > encounters a redirection.
> >
> >
> >
> >
> >
> > Would you be interested in getting the patch to integrate it into
> > ManifoldCF? The corresponding documentation can be found here:
> >
> https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/1886879745/Web+Connectors
> >
> >
> >
> > Regards,
> >
> >
> >
> > Emeric Bernet-Rollande
> >
> >
> >
> > *France Labs – Your knowledge, now*
> >
> > Datafari Enterprise Search – Découvrez la version 5 / Discover our
> version
> > 5
> > www.datafari.com
> >
> >
> >
>
>


Re: (CONNECTORS-1740) Solr 9 output connector

2023-09-05 Thread Karl Wright
I don't have any special influence in the Zookeeper project I fear.

On Tue, Sep 5, 2023 at 3:46 AM Guylaine BASSETTE <
guylaine.basse...@francelabs.com> wrote:

> Hello Karl,
>
> Since we sent the ticket to Zookeeper, we've had no response from them.
> Can we ask you for a little help to move the subject forward?
>
> Best regards,
> Guylaine
>
> France Labs – Your knowledge, now
> Datafari Enterprise Search – Découvrez la version 5 / Discover our version
> 5
> www.datafari.com <http://www.datafari.com>
> On 18/07/2023 at 17:19, Guylaine BASSETTE wrote:
> >
> > Hello Karl,
> >
> > Thanks for your answer.
> >
> > We have created a Jira Bug to ZooKeeper:
> > https://issues.apache.org/jira/browse/ZOOKEEPER-4722.
> >
> > They might contact you for help.
> >
> >
> > Best regards,
> > Guylaine
> >
> > France Labs – Your knowledge, now
> > Datafari Enterprise Search – Découvrez la version 5 / Discover our
> > version 5
> > www.datafari.com <http://www.datafari.com>
> >
> >
> > On 02/06/2023 at 01:04, Karl Wright wrote:
> >> Okay, it's as I suspected, the Zookeeper update didn't change any
> >> functionality but just broke stuff.
> >>
> >> The first thing I'd do is alert the Solr team to the problem.  They
> should
> >> for now roll back their dependency so that an earlier Zookeeper is used.
> >> The next step would be to work with the Zookeeper team to use ManifoldCF
> >> unit tests to allow them to fix the problem, as you say. Rather than
> >> assuming this is the same problem we see in previous Zookeeper tickets
> (it
> >> probably is but we can't be sure of that), I'd create a new one
> describing
> >> very carefully how to reproduce this using a ManifoldCF branch checkout.
> >> Be prepared to interact with the Zookeeper team at some length about the
> >> problem and how to reproduce it.
> >>
> >> My sense is that Zookeeper's original authors are long gone and you may
> not
> >> get very far here.  And I have very limited time availability these
> days.
> >> If you are blocked in this in some way let me know and I will do my
> best to
> >> jump in and unblock you.
> >>
> >> I'd also fix the Solr 9 branch (after you make a copy of it for the
> >> Zookeeper folks) so that a working version of Zookeeper is downloaded
> and
> >> we can then merge that branch.  Please let me know when that is done and
> >> I'll integrate that work.
> >>
> >> Thanks,
> >> Karl
> >>
> >>
> >> On Thu, Jun 1, 2023 at 5:56 AM Guylaine BASSETTE <
> >> guylaine.basse...@francelabs.com> wrote:
> >>
> >>> Hi Karl,
> >>>
> >>> Following up on your discussion with Julien. I did some further
> testings
> >>> and I’m commenting here because I cannot comment in the existing ticket
> >>> (
> >>>
> https://issues.apache.org/jira/browse/CONNECTORS-1740?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel=17643980#comment-17643980
> >>> <
> >>>
> https://issues.apache.org/jira/browse/CONNECTORS-1740?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel=17643980#comment-17643980
> >)
> >>>
> >>> . We tested the Solr 9 output connector using the ZK library in its
> >>> 3.5.6 version, targeting a Solr 9.2, and it worked so for now this
> >>> output connector can be considered as valid.
> >>>
> > >>> Still, in the long run, I think this ZK bug will become an issue for
> > >>> MCF. Since the problem can be reproduced thanks to your tests,
> > >>> wouldn't it be worth commenting on their ZK issue to let them
> > >>> know that the issue is still present in ZK 3.5.7, that it does not only
> > >>> happen in Docker mode, and that it can be reproduced every time using
> > >>> the MCF testing framework?
> >>>
> >>> --
> >>>
> >>> Best Regards,
> >>> Guylaine
> >>>
> >>> France Labs – Your knowledge, now
> >>> Datafari Enterprise Search – Découvrez la version 5 / Discover our
> version
> >>> 5
> >>> www.datafari.com  <http://www.datafari.com>
> >>>
> >>>


Re: Solr connector authentication issue

2023-06-07 Thread Karl Wright
But if those are set, and the connection health check passes, then I can't
tell you why Solr is unhappy with your connection.  It's clearly working
sometimes.  I'd look on the Solr end to figure out whether its rejection is
coming from just one of your instances.



On Wed, Jun 7, 2023 at 7:49 AM Karl Wright  wrote:

> The Solr output connection configuration contains all credentials that are
> sent to Solr.  If those aren't set Solr won't get them.
>
> Karl
>
>
> On Wed, Jun 7, 2023 at 7:23 AM Marisol Redondo <
> marisol.redondo.gar...@gmail.com> wrote:
>
>> Hi,
>>
>> We are using Solr 8 with basic authentication, and when checking the
>> output connection I'm getting the exception "Solr authorization failure,
>> code 401: aborting job".
>>
>> The Solr type is SolrCloud, as we have 3 servers (installed in AWS
>> Kubernetes containers). I have set the user ID and password in the Server
>> tab and can connect to ZooKeeper and Solr: if I uncheck the option
>> "Block anonymous request", the connector works.
>>
>> How can I make the connection work? I can't uncheck "Block
>> anonymous request".
>> Am I missing any other configuration?
>> Is there any other place where I have to set the user and password?
>>
>> Thanks
>> Marisol
>>
>>


Re: Solr connector authentication issue

2023-06-07 Thread Karl Wright
The Solr output connection configuration contains all credentials that are
sent to Solr.  If those aren't set Solr won't get them.
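
For context, HTTP basic authentication means that each request carries an Authorization header built from the user ID and password. The following is a minimal sketch of how such a header value is formed, using only the JDK; it is not the Solr connector's code, and the class name and credentials are placeholders.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthSketch {
    /** Builds the value of an HTTP "Authorization" header for basic authentication. */
    public static String header(String user, String password) {
        String pair = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Placeholder credentials, not taken from this thread.
        System.out.println(header("solr-user", "secret"));
    }
}
```

A 401 from Solr indicates that a request arrived without a header like this, or with credentials the server rejected, so comparing what each Solr instance actually receives is a reasonable next step.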

Karl


On Wed, Jun 7, 2023 at 7:23 AM Marisol Redondo <
marisol.redondo.gar...@gmail.com> wrote:

> Hi,
>
> We are using Solr 8 with basic authentication, and when checking the
> output connection I'm getting the exception "Solr authorization failure,
> code 401: aborting job".
>
> The Solr type is SolrCloud, as we have 3 servers (installed in AWS
> Kubernetes containers). I have set the user ID and password in the Server
> tab and can connect to ZooKeeper and Solr: if I uncheck the option
> "Block anonymous request", the connector works.
>
> How can I make the connection work? I can't uncheck "Block
> anonymous request".
> Am I missing any other configuration?
> Is there any other place where I have to set the user and password?
>
> Thanks
> Marisol
>
>


Re: branches/CONNECTORS-1740

2023-06-07 Thread Karl Wright
That's because I changed the method call to include a "null" argument that
it didn't have before.
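
The effect of the extra "null" argument can be sketched with stand-in classes. This assumes, as described further down in this thread, that the older Jetty client offers a one-argument constructor taking the SSL factory and a two-argument constructor taking the transport and the SSL factory, but none taking the transport alone. The classes below are hypothetical stand-ins, not Jetty's.

```java
public class ConstructorOverloadSketch {
    static class Transport {}
    static class SslFactory {}

    /** Stand-in for a client with the two assumed constructors. */
    static class Client {
        final Transport transport;
        final SslFactory ssl;

        Client(SslFactory ssl) {
            this.transport = null;
            this.ssl = ssl;
        }

        Client(Transport transport, SslFactory ssl) {
            this.transport = transport;
            this.ssl = ssl;
        }
    }

    static Client build(boolean sslEnabled) {
        Transport transport = new Transport();
        SslFactory ssl = new SslFactory();
        // "new Client(transport)" would not compile here: the only one-argument
        // constructor expects an SslFactory, so javac reports an incompatible type.
        return sslEnabled
                ? new Client(transport, ssl)
                : new Client(transport, null); // explicit null selects the two-argument form
    }

    public static void main(String[] args) {
        System.out.println(build(false).transport != null);
    }
}
```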

If all the versions are right I think that is all we need to do.

FWIW, the change does require JDK 11 and if you don't have your JAVA_HOME
set up to point to that it won't build correctly now.
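
The JDK 8 build log quoted in this thread reports "class file has wrong version 55.0, should be 52.0". Those numbers are class-file major versions, and the Java release is the major version minus 44, so 55 means Java 11 and 52 means Java 8. A small sketch of that arithmetic, with a made-up class name:

```java
public class ClassFileVersionSketch {
    /** Maps a class-file major version to the Java release that introduced it. */
    public static int javaRelease(int majorVersion) {
        return majorVersion - 44;
    }

    public static void main(String[] args) {
        System.out.println("major 55 -> Java " + javaRelease(55));
        System.out.println("major 52 -> Java " + javaRelease(52));
    }
}
```

In other words, that log shows a Java 8 compiler refusing a library compiled for Java 11.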


On Tue, Jun 6, 2023 at 6:42 PM Mingchun Zhao 
wrote:

> Hi Karl,
>
> Thanks for your reply!
>
> I pulled the latest trunk and confirmed the jetty.version within build.xml
> and pom.xml was already updated to "9.4.48.v20220622".
>
> And then, I tried the build command:
>
> *ant clean clean-deps clean-core-deps make-core-deps make-deps build*
>
> with different combinations of jdk and ant versions and got different
> results.
> However, the build error below as you pointed out in your previous mail did
> not occur.
>
> *incompatible types: HttpClientTransport cannot be converted to
> SslContextFactory*
>
> ## jdk11.0.11 + ant1.10.13
>
> ==
> *BUILD SUCCESSFUL*
> *Total time: 1 minute 52 seconds*
> ==
>
> ## jdk11.0.11 + ant1.8.2
>
> ==
> *BUILD FAILED*
> */Users/zhaomingchun/mcf/manifoldcf/build.xml:294: The following error
> occurred while executing this line:*
> */Users/zhaomingchun/mcf/manifoldcf/framework/build.xml:283: Error starting
> Sun's native2ascii:*
> *Total time: 5 seconds*
> ==
>
> ## jdk1.8.0_292 + ant1.8.2(Also ant1.10.13)
> *==*
> *compile-connector:*
> * [javac] Compiling 17 source files to
> /Users/zhaomingchun/mcf/manifoldcf/connectors/solr/build/connector/classes*
> * [javac]
>
> /Users/zhaomingchun/mcf/manifoldcf/connectors/solr/connector/src/main/java/org/apache/manifoldcf/agents/output/solr/ModifiedHttp2SolrClient.java:3:
> error: cannot access Utils*
> * [javac] import static
> org.apache.solr.common.util.Utils.getObjectByPath;*
> * [javac] ^*
> * [javac] bad class file:
>
> /Users/zhaomingchun/mcf/manifoldcf/lib/solr-solrj-9.1.0.jar(org/apache/solr/common/util/Utils.class)*
> * [javac] class file has wrong version 55.0, should be 52.0*
> * [javac] Please remove or make sure it appears in the correct
> subdirectory of the classpath.*
>
> *BUILD FAILED*
> */Users/zhaomingchun/mcf/manifoldcf/build.xml:487: The following error
> occurred while executing this line:*
> */Users/zhaomingchun/mcf/manifoldcf/build.xml:469: The following error
> occurred while executing this line:*
> */Users/zhaomingchun/mcf/manifoldcf/dist/connector-build.xml:686: Compile
> failed; see the compiler error output for details.*
>
> *Total time: 5 minutes 27 seconds*
> ==
>
> Could you please tell me which java and ant version you are using?
>
> Regards,
> Mingchun
>
>
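
The javac message quoted above, "class file has wrong version 55.0, should be 52.0", refers to the class file format version. Version 52 corresponds to Java 8 and version 55 to Java 11, so the classes in solr-solrj-9.1.0.jar cannot be read by a JDK 8 compiler. A minimal sketch of reading that number from a class file header follows; the class and method names are illustrative only and are not part of ManifoldCF.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Reads the version header of a compiled .class file (illustrative helper). */
class ClassFileVersion {

    /** Returns the major version stored in the class file header. */
    static int readMajorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file: bad magic number");
        }
        data.readUnsignedShort();        // minor version, ignored here
        return data.readUnsignedShort(); // major version
    }

    /** Maps a major version to its Java release (valid for Java 5 and later). */
    static int javaRelease(int majorVersion) {
        return majorVersion - 44;
    }

    public static void main(String[] args) throws IOException {
        // Header bytes: magic number, minor version 0, major version 55.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 55};
        int major = readMajorVersion(new ByteArrayInputStream(header));
        System.out.println("major " + major + " -> Java " + javaRelease(major));
    }
}
```

Passing a stream opened on a real .class file extracted from a jar would report the release that jar was compiled for.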
> On Wed, Jun 7, 2023 at 6:10, Karl Wright wrote:
>
> > Interestingly I updated trunk by merging the branch, so I would have
> > expected the Jetty update to have happened properly.  You may want to
> > check why it didn't.
> >
> > Karl
> >
> >
> > On Tue, Jun 6, 2023 at 4:29 PM Mingchun Zhao 
> > wrote:
> >
> > > Hi Karl,
> > >
> > > Thanks for your email. About this compile error, I think there are two
> > > ways to fix it.
> > >
> > > 1. Change jetty.version within build.xml and pom.xml to
> > > 9.4.48.v20220622 same as CONNECTORS-1740 branch:
> > >
> > > build.xml
> > > - 
> > > + 
> > >
> > > pom.xml
> > > - 9.4.25.v20191220
> > > + 9.4.48.v20220622
> > >
> > > 2. Change the parameters of the HttpClient function within
> > > ModifiedHttp2SolrClient.java as below:
> > >
> > > - httpClient = sslEnabled ? new HttpClient(transport,
> > > sslContextFactory) : new HttpClient(transport);
> > > + httpClient = sslEnabled ? new HttpClient(transport,
> > > sslContextFactory) : new HttpClient(transport, null);
> > >
> > > The reason for this fix is that the constructor
> > > HttpClient(HttpClientTransport) does not exist in older Jetty versions
> > > such as 9.4.25.v20191220, so the compiler appears to have matched the
> > > call against HttpClient(SslContextFactory), which caused the conversion error.
> > >
> > >
> > > https://www.javadoc.io/doc/org.eclipse.jetty/jetty-project/9.4.25.v20191220/org/eclipse/jetty/client/HttpClient.html#%3Cinit%3E(org.eclipse.jetty.client.HttpClientTransport,org.eclipse.jetty.util.ssl.SslContextFactory)
> > >
> > > Best Regards,
> > > Mingchun
> > >
> > > On Tue, Jun 6, 2023 at 10:03, Karl Wright wrote:
> > > >
> > > > Hi Mingchun,
> > > >
> > > > The previous work done on this br

Re: branches/CONNECTORS-1740

2023-06-06 Thread Karl Wright
Interestingly I updated trunk by merging the branch, so I would have
expected the Jetty update to have happened properly.  You may want to check
why it didn't.

Karl


On Tue, Jun 6, 2023 at 4:29 PM Mingchun Zhao 
wrote:

> Hi Karl,
>
> Thanks for your email. About this compile error, I think there are two
> ways to fix it.
>
> 1. Change jetty.version within build.xml and pom.xml to
> 9.4.48.v20220622 same as CONNECTORS-1740 branch:
>
> build.xml
> - 
> + 
>
> pom.xml
> - 9.4.25.v20191220
> + 9.4.48.v20220622
>
> 2. Change the parameters of the HttpClient function within
> ModifiedHttp2SolrClient.java as below:
>
> - httpClient = sslEnabled ? new HttpClient(transport,
> sslContextFactory) : new HttpClient(transport);
> + httpClient = sslEnabled ? new HttpClient(transport,
> sslContextFactory) : new HttpClient(transport, null);
>
> The reason for this fix is that the constructor
> HttpClient(HttpClientTransport) does not exist in older Jetty versions
> such as 9.4.25.v20191220, so the compiler appears to have matched the
> call against HttpClient(SslContextFactory), which caused the conversion error.
>
>
> https://www.javadoc.io/doc/org.eclipse.jetty/jetty-project/9.4.25.v20191220/org/eclipse/jetty/client/HttpClient.html#%3Cinit%3E(org.eclipse.jetty.client.HttpClientTransport,org.eclipse.jetty.util.ssl.SslContextFactory)
>
> Best Regards,
> Mingchun
>
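
The second fix described above can be sketched with stand-in classes. The types below are NOT the real Jetty classes; they only mirror the constructor set that the thread attributes to the older Jetty version (a one-argument constructor taking the SSL factory and a two-argument constructor taking transport and SSL factory), so the names Transport, SslFactory and OldStyleClient are assumptions made for illustration.

```java
/** Stand-in for HttpClientTransport; not the real Jetty class. */
class Transport {}

/** Stand-in for SslContextFactory; not the real Jetty class. */
class SslFactory {}

/** Models a client that has no single-argument (Transport) constructor. */
class OldStyleClient {
    final Transport transport;
    final SslFactory ssl;

    OldStyleClient(SslFactory ssl) {
        this(new Transport(), ssl);
    }

    OldStyleClient(Transport transport, SslFactory ssl) {
        this.transport = transport;
        this.ssl = ssl;
    }
}

class ConstructorOverloadDemo {
    static OldStyleClient create(boolean sslEnabled, Transport transport, SslFactory ssl) {
        // "new OldStyleClient(transport)" would not compile here: the only
        // one-argument constructor expects an SslFactory, which gives an
        // "incompatible types" error of the kind reported in the thread.
        // Passing null as the second argument selects the two-argument form.
        return sslEnabled
            ? new OldStyleClient(transport, ssl)
            : new OldStyleClient(transport, null);
    }

    public static void main(String[] args) {
        OldStyleClient plain = create(false, new Transport(), new SslFactory());
        System.out.println("ssl configured: " + (plain.ssl != null));
    }
}
```

Whether passing null is acceptable to the real HttpClient at runtime is a separate question that this sketch does not answer; aligning jetty.version, the first option in the message, avoids the issue altogether.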
> On Tue, Jun 6, 2023 at 10:03, Karl Wright wrote:
> >
> > Hi Mingchun,
> >
> > The previous work done on this branch is almost complete but there is
> > still a build error I get:
> >
> > [javac]
> > C:\wip\mcf\trunk\connectors\solr\connector\src\main\java\org\apache\manifoldcf\agents\output\solr\ModifiedHttp2SolrClient.java:200:
> > error: incompatible types: HttpClientTransport cannot be converted to
> > SslContextFactory
> > [javac]   httpClient = sslEnabled ? new HttpClient(transport,
> > sslContextFactory) : new HttpClient(transport);
> > [javac]
> >    ^
> >
> > This didn't show up until I merged the branch onto trunk.  I haven't yet
> > committed it because it doesn't quite build.  Any idea how to resolve
> > this?
> >
> > Karl
>


branches/CONNECTORS-1740

2023-06-05 Thread Karl Wright
Hi Mingchun,

The previous work done on this branch is almost complete but there is still
a build error I get:

[javac]
C:\wip\mcf\trunk\connectors\solr\connector\src\main\java\org\apache\manifoldcf\agents\output\solr\ModifiedHttp2SolrClient.java:200:
error: incompatible types: HttpClientTransport cannot be converted to
SslContextFactory
[javac]   httpClient = sslEnabled ? new HttpClient(transport,
sslContextFactory) : new HttpClient(transport);
[javac]
   ^

This didn't show up until I merged the branch onto trunk.  I haven't yet
committed it because it doesn't quite build.  Any idea how to resolve this?

Karl


Re: (CONNECTORS-1740) Solr 9 output connector

2023-06-01 Thread Karl Wright
Okay, it's as I suspected, the Zookeeper update didn't change any
functionality but just broke stuff.

The first thing I'd do is alert the Solr team to the problem.  They should
for now roll back their dependency so that an earlier Zookeeper is used.
The next step would be to work with the Zookeeper team to use ManifoldCF
unit tests to allow them to fix the problem, as you say. Rather than
assuming this is the same problem we see in previous Zookeeper tickets (it
probably is but we can't be sure of that), I'd create a new one describing
very carefully how to reproduce this using a ManifoldCF branch checkout.
Be prepared to interact with the Zookeeper team at some length about the
problem and how to reproduce it.

My sense is that Zookeeper's original authors are long gone and you may not
get very far here.  And I have very limited time availability these days.
If you are blocked in this in some way let me know and I will do my best to
jump in and unblock you.

I'd also fix the Solr 9 branch (after you make a copy of it for the
Zookeeper folks) so that a working version of Zookeeper is downloaded and
we can then merge that branch.  Please let me know when that is done and
I'll integrate that work.

Thanks,
Karl


On Thu, Jun 1, 2023 at 5:56 AM Guylaine BASSETTE <
guylaine.basse...@francelabs.com> wrote:

> Hi Karl,
>
> Following up on your discussion with Julien. I did some further testing
> and I’m commenting here because I cannot comment in the existing ticket
> (https://issues.apache.org/jira/browse/CONNECTORS-1740?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel=17643980#comment-17643980).
> We tested the Solr 9 output connector using the ZK library in its
> 3.5.6 version, targeting Solr 9.2, and it worked, so for now this
> output connector can be considered valid.
>
> Still, in the long run, I think this ZK bug will become an issue for
> MCF. Since, thanks to your testing, the problem can be reproduced,
> wouldn’t it be worthwhile for you to comment on their ZK issue, letting
> them know that the issue is still present in ZK 3.5.7, that it does not
> only happen in Docker mode, and that it can be reproduced every time
> using the MCF testing framework?
>
> --
>
> Best Regards,
> Guylaine
>
> France Labs – Your knowledge, now
> Datafari Enterprise Search – Découvrez la version 5 / Discover our version
> 5
> www.datafari.com 
>
>


[RESULT][VOTE] Release ManifoldCF 2.25, RC0

2023-06-01 Thread Karl Wright
Three binding +1's, >72 hrs.  Vote passes!
Karl


On Thu, Jun 1, 2023 at 6:27 PM Karl Wright  wrote:

> +1 from me as well.
>
>
> On Thu, Jun 1, 2023 at 12:07 PM Karl Wright  wrote:
>
>> Hi -
>> This is a vote thread on a specific release artifact.  CONNECTORS-1746 is
>> indeed included in this release.
>>
>> Incorporating a JSON-based generic connector hasn't happened yet because
>> the contribution needed to be complete, and a release was requested before
>> that happened.
>>
>> Karl
>>
>>
>>
>>
>> On Thu, Jun 1, 2023 at 7:37 AM Guylaine BASSETTE <
>> guylaine.basse...@francelabs.com> wrote:
>>
>>> Hi all,
>>>
>>> Do you think it would make sense to include as well the following
>>> modifications into 2.25 ? They don’t require lots of modifications, but
>>> they would benefit everyone:
>>>
>>>   * the fix on CSV connector I proposed in mail "Control over number of
>>> processed documents per thread" on 2023/05/22
>>>   * (CONNECTORS-1740) Solr 9 output connector (mail today)
>>>
>>> As for my 2 other suggestions, I leave it up to you to decide.
>>>
>>>   * Json based generic authority connector (mail on 2023/05/23)
>>>   * Reading a document in Transfo Connector: Utility Classes (mail today)
>>>
>>> As a side note, did you also envision to include the optimisation on
>>> postgresql usage as proposed by Mingchun Zhao ?
>>> https://issues.apache.org/jira/browse/CONNECTORS-1746
>>>
>>>
>>> BTW, many thanks Mingchun for your 2 proposals on postgre!
>>>
>>>
>>> Kind regards,
>>> Guylaine
>>>
>>> France Labs – Your knowledge, now
>>> Datafari Enterprise Search – Découvrez la version 5 / Discover our
>>> version 5
>>> www.datafari.com <http://www.datafari.com>
>>>
>>>
>>> On 30/05/2023 at 11:13, Mingchun Zhao wrote:
>>> > +1 (non-binding)
>>> >
>>> > The following tests passed.
>>> > - Unit tests
>>> > - Integration tests with PostgreSQL
>>> > - Load tests with PostgreSQL
>>> > - New feature: the ability to disable hopcount tracking entirely, for
>>> > better performance of the web connector
>>> >
>>> > Regards,
>>> > Mingchun
>>> >
>>> > On Tue, May 30, 2023 at 6:08, Karl Wright wrote:
>>> >> Please vote on whether to release ManifoldCF 2.25, RC0.
>>> >>
>>> >> This release contains one new feature: the ability to disable hopcount
>>> >> tracking entirely, for better performance of the web connector.  The
>>> >> attempt to update the Solr connector to release 9.x of Solr did NOT
>>> >> make it in because that version of SolrJ depends on a broken version of
>>> >> zookeeper, our thread coordination library.
>>> >>
>>> >> A release artifact can be found here:
>>> >> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.25
>>> >>
>>> >> A release tag can also be found at
>>> >> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.25-RC0  .
>>> >>
>>> >> Karl
>>
>>


Re: [VOTE] Release ManifoldCF 2.25, RC0

2023-06-01 Thread Karl Wright
+1 from me as well.


On Thu, Jun 1, 2023 at 12:07 PM Karl Wright  wrote:

> Hi -
> This is a vote thread on a specific release artifact.  CONNECTORS-1746 is
> indeed included in this release.
>
> Incorporating a JSON-based generic connector hasn't happened yet because
> the contribution needed to be complete, and a release was requested before
> that happened.
>
> Karl
>
>
>
>
> On Thu, Jun 1, 2023 at 7:37 AM Guylaine BASSETTE <
> guylaine.basse...@francelabs.com> wrote:
>
>> Hi all,
>>
>> Do you think it would make sense to include as well the following
>> modifications into 2.25 ? They don’t require lots of modifications, but
>> they would benefit everyone:
>>
>>   * the fix on CSV connector I proposed in mail "Control over number of
>> processed documents per thread" on 2023/05/22
>>   * (CONNECTORS-1740) Solr 9 output connector (mail today)
>>
>> As for my 2 other suggestions, I leave it up to you to decide.
>>
>>   * Json based generic authority connector (mail on 2023/05/23)
>>   * Reading a document in Transfo Connector: Utility Classes (mail today)
>>
>> As a side note, did you also envision to include the optimisation on
>> postgresql usage as proposed by Mingchun Zhao ?
>> https://issues.apache.org/jira/browse/CONNECTORS-1746
>>
>>
>> BTW, many thanks Mingchun for your 2 proposals on postgre!
>>
>>
>> Kind regards,
>> Guylaine
>>
>> France Labs – Your knowledge, now
>> Datafari Enterprise Search – Découvrez la version 5 / Discover our
>> version 5
>> www.datafari.com <http://www.datafari.com>
>>
>>
>> On 30/05/2023 at 11:13, Mingchun Zhao wrote:
>> > +1 (non-binding)
>> >
>> > The following tests passed.
>> > - Unit tests
>> > - Integration tests with PostgreSQL
>> > - Load tests with PostgreSQL
>> > - New feature: the ability to disable hopcount tracking entirely, for
>> > better performance of the web connector
>> >
>> > Regards,
>> > Mingchun
>> >
>> > On Tue, May 30, 2023 at 6:08, Karl Wright wrote:
>> >> Please vote on whether to release ManifoldCF 2.25, RC0.
>> >>
>> >> This release contains one new feature: the ability to disable hopcount
>> >> tracking entirely, for better performance of the web connector.  The
>> >> attempt to update the Solr connector to release 9.x of Solr did NOT
>> >> make it in because that version of SolrJ depends on a broken version of
>> >> zookeeper, our thread coordination library.
>> >>
>> >> A release artifact can be found here:
>> >> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.25
>> >>
>> >> A release tag can also be found at
>> >> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.25-RC0  .
>> >>
>> >> Karl
>
>


Re: [VOTE] Release ManifoldCF 2.25, RC0

2023-06-01 Thread Karl Wright
Hi -
This is a vote thread on a specific release artifact.  CONNECTORS-1746 is
indeed included in this release.

Incorporating a JSON-based generic connector hasn't happened yet because
the contribution needed to be complete, and a release was requested before
that happened.

Karl




On Thu, Jun 1, 2023 at 7:37 AM Guylaine BASSETTE <
guylaine.basse...@francelabs.com> wrote:

> Hi all,
>
> Do you think it would make sense to include as well the following
> modifications into 2.25 ? They don’t require lots of modifications, but
> they would benefit everyone:
>
>   * the fix on CSV connector I proposed in mail "Control over number of
> processed documents per thread" on 2023/05/22
>   * (CONNECTORS-1740) Solr 9 output connector (mail today)
>
> As for my 2 other suggestions, I leave it up to you to decide.
>
>   * Json based generic authority connector (mail on 2023/05/23)
>   * Reading a document in Transfo Connector: Utility Classes (mail today)
>
> As a side note, did you also envision to include the optimisation on
> postgresql usage as proposed by Mingchun Zhao ?
> https://issues.apache.org/jira/browse/CONNECTORS-1746
>
>
> BTW, many thanks Mingchun for your 2 proposals on postgre!
>
>
> Kind regards,
> Guylaine
>
> France Labs – Your knowledge, now
> Datafari Enterprise Search – Découvrez la version 5 / Discover our version
> 5
> www.datafari.com <http://www.datafari.com>
>
>
> On 30/05/2023 at 11:13, Mingchun Zhao wrote:
> > +1 (non-binding)
> >
> > The following tests passed.
> > - Unit tests
> > - Integration tests with PostgreSQL
> > - Load tests with PostgreSQL
> > - New feature: the ability to disable hopcount tracking entirely, for
> > better performance of the web connector
> >
> > Regards,
> > Mingchun
> >
> > On Tue, May 30, 2023 at 6:08, Karl Wright wrote:
> >> Please vote on whether to release ManifoldCF 2.25, RC0.
> >>
> >> This release contains one new feature: the ability to disable hopcount
> >> tracking entirely, for better performance of the web connector.  The
> >> attempt to update the Solr connector to release 9.x of Solr did NOT
> >> make it in because that version of SolrJ depends on a broken version of
> >> zookeeper, our thread coordination library.
> >>
> >> A release artifact can be found here:
> >> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.25
> >>
> >> A release tag can also be found at
> >> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.25-RC0  .
> >>
> >> Karl


[jira] [Resolved] (CONNECTORS-1746) Adding conditions to execute PostgreSQL's ANALYZE command to avoid crawling become extremely slow.

2023-06-01 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1746.
-
Fix Version/s: ManifoldCF 2.25
   Resolution: Fixed

> Adding conditions to execute PostgreSQL's ANALYZE command to avoid crawling 
> become extremely slow.
> --
>
> Key: CONNECTORS-1746
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1746
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Web connector
> Environment: Using ManifoldCF 2.24 with PostgreSQL 12.14 as the 
> database. 
>Reporter: Mingchun Zhao
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.25
>
> Attachments: DBInterfacePostgreSQL.java.patch
>
>
> Sometimes, the crawling does not process any documents for a while and there 
> is nothing logged about long-running queries. The performance can be restored 
> by firing the 'ANALYZE' command manually. It seems that a bad query plan 
> caused this performance problem.
> Therefore, in addition to the current configuration parameter 
> 'org.apache.manifoldcf.db.postgres.analyze.', it is considered 
> necessary to execute the 'ANALYZE' even in the following situations.
> 1. When the number of records in the table exceeds the number required for 
> creating an execution plan after the job starts.
> 2. When the crawling performance slows down. For example, if the processing 
> rate of documents drops below a specified threshold.
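
The two conditions in the issue description above can be sketched as a small decision helper. This is not the code in DBInterfacePostgreSQL.java.patch, which is not shown in this thread; the class name, the thresholds, and the choice to fire the row-count condition only once per job are all assumptions made for illustration.

```java
/** Hypothetical policy deciding when to issue ANALYZE; names are illustrative. */
class AnalyzePolicy {
    private final long minRowsForPlan;      // assumed row-count threshold
    private final double minDocsPerSecond;  // assumed throughput floor
    private boolean analyzedForRowCount = false;

    AnalyzePolicy(long minRowsForPlan, double minDocsPerSecond) {
        this.minRowsForPlan = minRowsForPlan;
        this.minDocsPerSecond = minDocsPerSecond;
    }

    /** Returns true when either condition from the issue description holds. */
    boolean shouldAnalyze(long tableRows, double docsPerSecond) {
        // Condition 1: the table has grown past the size needed for a
        // useful execution plan. Fired once so it does not repeat forever.
        if (!analyzedForRowCount && tableRows > minRowsForPlan) {
            analyzedForRowCount = true;
            return true;
        }
        // Condition 2: the document processing rate dropped below the floor.
        return docsPerSecond < minDocsPerSecond;
    }

    public static void main(String[] args) {
        AnalyzePolicy policy = new AnalyzePolicy(1000, 5.0);
        System.out.println(policy.shouldAnalyze(500, 10.0));  // small table, healthy rate
        System.out.println(policy.shouldAnalyze(1500, 10.0)); // table passed the threshold
        System.out.println(policy.shouldAnalyze(2000, 1.0));  // rate below the floor
    }
}
```

A caller would run the ANALYZE statement against the affected table whenever shouldAnalyze returns true; a real implementation would probably also rate-limit the second condition so that a persistently slow crawl does not trigger ANALYZE on every check.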



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[VOTE] Release ManifoldCF 2.25, RC0

2023-05-29 Thread Karl Wright
Please vote on whether to release ManifoldCF 2.25, RC0.

This release contains one new feature: the ability to disable hopcount
tracking entirely, for better performance of the web connector.  The
attempt to update the Solr connector to release 9.x of Solr did NOT make it
in because that version of SolrJ depends on a broken version of zookeeper,
our thread coordination library.

A release artifact can be found here:
https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.25

A release tag can also be found at
https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.25-RC0 .

Karl


[jira] [Commented] (CONNECTORS-1747) Add a property to disable logging hop count to database

2023-05-27 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17726821#comment-17726821
 ] 

Karl Wright commented on CONNECTORS-1747:
-

I am away from fast internet until Monday.  I will put up a release candidate 
then and call a vote.


> Add a property to disable logging hop count to database
> ---
>
> Key: CONNECTORS-1747
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1747
> Project: ManifoldCF
>  Issue Type: Improvement
>Reporter: Mingchun Zhao
>    Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.25
>
> Attachments: CONNECTORS-1747.patch
>
>
> If we do not require the “Hop Filters” feature, we should consider disabling 
> the logging of hopcount-related records to database tables such as 
> "intrinsiclink" and "hopcount". This can increase throughput and reduce 
> the rate of growth of the database.
> I will try to create a patch for this.





Re: Long Job on Windows Share

2023-05-25 Thread Karl Wright
The jcifs connector does not include a lot of information in the version
string for a file - basically, the length, and the modified date.  So I
would not expect there to be a lot of actual work involved if there are no
changes to a document.

The activity "access" does imply that the system believes that the document
does need to be reindexed.  It clearly reads the document properly.  I
would check to be sure it actually indexes the document.  I suspect that
your job may be reading the file but determining it is not suitable for
indexing and then repeating that every day.  You can see this by looking
for the document in the activity log to see what ManifoldCF decided to do
with it.
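
The version-string comparison described above can be sketched as follows. The string format and the method names are assumptions for illustration; the actual jcifs connector builds its own version string, and only the idea (length plus modified date, compared against the stored value) is taken from the message.

```java
/** Illustrative version-string check; not the jcifs connector's real code. */
class FileVersion {

    /** Builds a version string from file length and last-modified time. */
    static String versionString(long length, long lastModifiedMillis) {
        return length + ":" + lastModifiedMillis;
    }

    /** A document needs processing when no version is stored or it differs. */
    static boolean needsReindex(String storedVersion, String currentVersion) {
        return storedVersion == null || !storedVersion.equals(currentVersion);
    }

    public static void main(String[] args) {
        String stored = versionString(2048L, 1684972800000L);
        String unchanged = versionString(2048L, 1684972800000L);
        String modified = versionString(4096L, 1685059200000L);
        System.out.println("unchanged file: " + needsReindex(stored, unchanged));
        System.out.println("modified file:  " + needsReindex(stored, modified));
    }
}
```

Under this model an unmodified file costs only a metadata lookup per crawl, which is why a daily "access" entry for an unchanged document suggests that no version was stored for it, for example because it was rejected before indexing.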

Karl



On Thu, May 25, 2023 at 6:03 AM Bisonti Mario 
wrote:

> Hi,
>
> I would like to understand how recrawl works.
>
> My job scan, using “Connection Type” “Windows shares”, runs for nearly 18
> hours.
>
> My documents number roughly 1 million.
>
> If I check the document scan in ManifoldCF I see, for example:
>
> It seems that it reworks the document every day even if it hasn’t been
> modified.
>
> So, is that right, or did I choose the wrong job to crawl the documents?
>
>
>
> Thanks a lot
>
> Mario
>
>
>
>
>


[jira] [Commented] (CONNECTORS-1747) Add a property to disable logging hop count to database

2023-05-24 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17725989#comment-17725989
 ] 

Karl Wright commented on CONNECTORS-1747:
-

I can put up a release candidate easily enough; however, it may be hard to get 
a voting quorum.  That's the issue these days in this project.


> Add a property to disable logging hop count to database
> ---
>
> Key: CONNECTORS-1747
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1747
> Project: ManifoldCF
>  Issue Type: Improvement
>Reporter: Mingchun Zhao
>    Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.25
>
> Attachments: CONNECTORS-1747.patch
>
>
> If we do not require the “Hop Filters” feature, we should consider disabling 
> the logging of hopcount-related records to database tables such as 
> "intrinsiclink" and "hopcount". This can increase throughput and reduce 
> the rate of growth of the database.
> I will try to create a patch for this.





[jira] [Updated] (CONNECTORS-1747) Add a property to disable logging hop count to database

2023-05-24 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1747:

Fix Version/s: ManifoldCF 2.25
   (was: ManifoldCF next)

> Add a property to disable logging hop count to database
> ---
>
> Key: CONNECTORS-1747
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1747
> Project: ManifoldCF
>  Issue Type: Improvement
>Reporter: Mingchun Zhao
>    Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.25
>
> Attachments: CONNECTORS-1747.patch
>
>
> If we do not require the “Hop Filters” feature, we should consider disabling 
> the logging of hopcount-related records to database tables such as 
> "intrinsiclink" and "hopcount". This can increase throughput and reduce 
> the rate of growth of the database.
> I will try to create a patch for this.





[jira] [Resolved] (CONNECTORS-1747) Add a property to disable logging hop count to database

2023-05-24 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1747.
-
Fix Version/s: ManifoldCF next
   Resolution: Fixed

r1910036

> Add a property to disable logging hop count to database
> ---
>
> Key: CONNECTORS-1747
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1747
> Project: ManifoldCF
>  Issue Type: Improvement
>Reporter: Mingchun Zhao
>    Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF next
>
> Attachments: CONNECTORS-1747.patch
>
>
> If we do not require the “Hop Filters” feature, we should consider disabling 
> the logging of hopcount-related records to database tables such as 
> "intrinsiclink" and "hopcount". This can increase throughput and reduce 
> the rate of growth of the database.
> I will try to create a patch for this.





[jira] [Commented] (CONNECTORS-1747) Add a property to disable logging hop count to database

2023-05-24 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17725921#comment-17725921
 ] 

Karl Wright commented on CONNECTORS-1747:
-

This looks good.  I'll try to commit it tonight.


> Add a property to disable logging hop count to database
> ---
>
> Key: CONNECTORS-1747
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1747
> Project: ManifoldCF
>  Issue Type: Improvement
>Reporter: Mingchun Zhao
>    Assignee: Karl Wright
>Priority: Major
> Attachments: CONNECTORS-1747.patch
>
>
> If we do not require the “Hop Filters” feature, we should consider disabling 
> the logging of hopcount-related records to database tables such as 
> "intrinsiclink" and "hopcount". This can increase throughput and reduce 
> the rate of growth of the database.
> I will try to create a patch for this.





[jira] [Commented] (CONNECTORS-1747) Add a property to disable logging hop count to database

2023-05-21 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724654#comment-17724654
 ] 

Karl Wright commented on CONNECTORS-1747:
-

Hi - so just to be clear, what you need to do here is:
(1) Introduce a property, as you have done, that disables support for hopcount 
handling completely.  It obviously should be a global cluster property, not a 
local one.
(2) When that property is set, the HopCount.java class should never record 
anything in the intrinsicLinks or HopCount tables at all.
(3) When that property is set, the Hopcount tab should not appear in the UI for 
any job.
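
The three steps listed above can be sketched roughly as below. Everything here is hypothetical: the property key, the class names, and the tab list are placeholders for illustration and do not reflect the actual patch or the real HopCount.java and UI code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Step 1: a cluster-wide switch read from shared properties (key is hypothetical). */
class HopcountSettings {
    static final String DISABLE_KEY = "example.crawler.disablehopcount";
    final boolean hopcountDisabled;

    HopcountSettings(Properties clusterProperties) {
        hopcountDisabled =
            Boolean.parseBoolean(clusterProperties.getProperty(DISABLE_KEY, "false"));
    }
}

/** Step 2: the recorder writes nothing at all when the switch is set. */
class HopcountRecorder {
    final List<String> recordedLinks = new ArrayList<>(); // stands in for the tables
    private final HopcountSettings settings;

    HopcountRecorder(HopcountSettings settings) {
        this.settings = settings;
    }

    void recordReference(String parent, String child) {
        if (settings.hopcountDisabled) {
            return; // no intrinsic-link or hopcount rows are written
        }
        recordedLinks.add(parent + " -> " + child);
    }
}

/** Step 3: the hop-count tab is hidden for every job when the switch is set. */
class JobTabs {
    static List<String> visibleTabs(HopcountSettings settings) {
        // Tab names other than "Hop Filters" are placeholders.
        List<String> tabs = new ArrayList<>(Arrays.asList("Name", "Connection", "Scheduling"));
        if (!settings.hopcountDisabled) {
            tabs.add("Hop Filters");
        }
        return tabs;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(HopcountSettings.DISABLE_KEY, "true");
        HopcountSettings settings = new HopcountSettings(props);
        HopcountRecorder recorder = new HopcountRecorder(settings);
        recorder.recordReference("doc-a", "doc-b");
        System.out.println("links recorded: " + recorder.recordedLinks.size());
        System.out.println("tabs: " + visibleTabs(settings));
    }
}
```

Reading the flag once from cluster-wide properties, rather than per job, matches the point that an installation with the flag set cannot track hopcount at all, so there is no per-job state to reconcile.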


> Add a property to disable logging hop count to database
> ---
>
> Key: CONNECTORS-1747
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1747
> Project: ManifoldCF
>  Issue Type: Improvement
>Reporter: Mingchun Zhao
>    Assignee: Karl Wright
>Priority: Major
> Attachments: JobManager.java.patch
>
>
> If we do not require the “Hop Filters” feature, we should consider disabling 
> the logging of hopcount-related records to database tables such as 
> "intrinsiclink" and "hopcount". This can increase throughput and reduce 
> the rate of growth of the database.
> I will try to create a patch for this.





[jira] [Commented] (CONNECTORS-1747) Add a property to disable logging hop count to database

2023-05-21 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724652#comment-17724652
 ] 

Karl Wright commented on CONNECTORS-1747:
-

[~mingchun.zhao], it will be necessary to also disable the hopcount tab for all 
jobs entirely if you set this flag, since essentially the installation no 
longer can track hopcount at all.  Please include that in your commit, thanks.




> Add a property to disable logging hop count to database
> ---
>
> Key: CONNECTORS-1747
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1747
> Project: ManifoldCF
>  Issue Type: Improvement
>Reporter: Mingchun Zhao
>    Assignee: Karl Wright
>Priority: Major
> Attachments: JobManager.java.patch
>
>
> If we do not require the “Hop Filters” feature, we should consider disabling 
> the logging of hopcount-related records to database tables such as 
> "intrinsiclink" and "hopcount". This can increase throughput and reduce 
> the rate of growth of the database.
> I will try to create a patch for this.





[jira] [Assigned] (CONNECTORS-1747) Add a property to disable logging hop count to database

2023-05-21 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1747:
---

Assignee: Karl Wright

> Add a property to disable logging hop count to database
> ---
>
> Key: CONNECTORS-1747
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1747
> Project: ManifoldCF
>  Issue Type: Improvement
>Reporter: Mingchun Zhao
>    Assignee: Karl Wright
>Priority: Major
> Attachments: JobManager.java.patch
>
>
> If we do not require the “Hop Filters” feature, we should consider disabling 
> the logging of hopcount-related records to database tables such as 
> "intrinsiclink" and "hopcount". This can increase throughput and reduce 
> the rate of growth of the database.
> I will try to create a patch for this.





Re: About disabling hopcount tracking

2023-05-21 Thread Karl Wright
For some reason I did not see any emails from you for a full 10 days after
you sent them.  I wonder why this was?  Perhaps Apache infrastructure was
misbehaving but I apologize for the late response.


On Sun, May 21, 2023 at 8:59 AM Karl Wright  wrote:

> Hi - the big source of bloat for hopcount processing is the delete
> dependencies table, and the options provided allow you to not track those
> at all.  The other tables (intrinsiclink and hopcount) are 1:1 with the
> documents themselves, so these were not considered worth optimizing.
>
> It may be possible to introduce a fourth hopcount mode that did not record
> any information in those tables - but since this can be changed on a job,
> very careful analysis would need to be done to figure out what happens when
> someone flips that setting after a crawl has already been run.
>
> Karl
>
>
> On Thu, May 11, 2023 at 2:28 AM Mingchun Zhao 
> wrote:
>
>> Hi Karl,
>>
>> Thank you for taking time out of your busy schedule to reply.
>>
>> > There is an option on the "hopcount" tab of your job to disable hopcount
>>
>> You mean setting "Hop count mode" to "keep unreachable documents,
>> forever" in the "Hop Filters" tab?
>> Yes, I did it, however, it seems that the records were still inserted
>> into the "intrinsiclink" and "hopcount" tables. Is there a way to tell
>> MCF not to insert data into those tables because operations on it can
>> become a performance bottleneck when the tables bloat?
>>
>> Regards,
>> Mingchun
>>
>> On Wed, May 10, 2023 at 19:53, Karl Wright wrote:
>> >
>> > There is an option on the "hopcount" tab of your job to disable hopcount
>> > tracking entirely.
>> > Karl
>> >
>> > On Tue, May 9, 2023 at 11:49 PM Mingchun Zhao <mingchun.zha...@gmail.com>
>> > wrote:
>> >
>> > > Hi Karl,
>> > >
>> > > Could you please advise me on tracking hopcount.
>> > > I'm using ManifoldCF 2.24 with PostgreSQL 12.14 as the database for
>> > > now.
>> > > In my case, I don't need to use the 'Hop Filters' feature so I'd like
>> > > to disable tracking hopcount and reduce the insert/update/delete load
>> > > on the 'intrinsiclink' and 'hopcount' tables. So I have two questions
>> > > about this.
>> > > First, is there an option to disable tracking hopcount?
>> > > Second, if I disable tracking hopcount , can it affect other crawling
>> > > processes?
>> > >
>> > > Thank you in advance.
>> > > Kind regards,
>> > > Mingchun
>> > >
>>
>


Re: About disabling hopcount tracking

2023-05-21 Thread Karl Wright
Hi - the big source of bloat for hopcount processing is the delete
dependencies table, and the options provided allow you to not track those
at all.  The other tables (intrinsiclink and hopcount) are 1:1 with the
documents themselves, so these were not considered worth optimizing.

It may be possible to introduce a fourth hopcount mode that did not record
any information in those tables - but since this can be changed on a job,
very careful analysis would need to be done to figure out what happens when
someone flips that setting after a crawl has already been run.

Karl


On Thu, May 11, 2023 at 2:28 AM Mingchun Zhao 
wrote:

> Hi Karl,
>
> Thank you for taking time out of your busy schedule to reply.
>
> > There is an option on the "hopcount" tab of your job to disable hopcount
>
> You mean setting "Hop count mode" to "keep unreachable documents,
> forever" in the "Hop Filters" tab?
> Yes, I did that; however, records still seem to be inserted
> into the "intrinsiclink" and "hopcount" tables. Is there a way to tell
> MCF not to insert data into those tables? Operations on them can
> become a performance bottleneck when the tables bloat.
>
> Regards,
> Mingchun
>
> On Wed, May 10, 2023 at 19:53, Karl Wright wrote:
> >
> > There is an option on the "hopcount" tab of your job to disable hopcount
> > tracking entirely.
> > Karl
> >
> > On Tue, May 9, 2023 at 11:49 PM Mingchun Zhao  >
> > wrote:
> >
> > > Hi Karl,
> > >
> > > Could you please advise me on hopcount tracking?
> > > I'm using ManifoldCF 2.24 with PostgreSQL 12.14 as the database for now.
> > > In my case, I don't need the 'Hop Filters' feature, so I'd like
> > > to disable hopcount tracking and reduce the insert/update/delete load
> > > on the 'intrinsiclink' and 'hopcount' tables. So I have two questions
> > > about this.
> > > First, is there an option to disable hopcount tracking?
> > > Second, if I disable hopcount tracking, can it affect other crawling
> > > processes?
> > >
> > > Thank you in advance.
> > > Kind regards,
> > > Mingchun
> > >
>


[jira] [Commented] (CONNECTORS-1746) Adding conditions to execute PostgreSQL's ANALYZE command to avoid crawling become extremely slow.

2023-05-12 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17722261#comment-17722261
 ] 

Karl Wright commented on CONNECTORS-1746:
-

Patch committed: r1909780


> Adding conditions to execute PostgreSQL's ANALYZE command to avoid crawling 
> become extremely slow.
> --
>
> Key: CONNECTORS-1746
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1746
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Web connector
> Environment: Using ManifoldCF 2.24 with PostgreSQL 12.14 as the 
> database. 
>Reporter: Mingchun Zhao
>Assignee: Karl Wright
>Priority: Major
> Attachments: DBInterfacePostgreSQL.java.patch
>
>
> Sometimes, the crawling does not process any documents for a while and there 
> is nothing logged about long-running queries. The performance can be restored 
> by firing the 'ANALYZE' command manually. It seems that a bad query plan 
> caused this performance problem.
> Therefore, in addition to the current configuration parameter 
> 'org.apache.manifoldcf.db.postgres.analyze.', it is considered 
> necessary to execute the 'ANALYZE' even in the following situations.
> 1. When the number of records in the table exceeds the number required for 
> creating an execution plan after the job starts.
> 2. When the crawling performance slows down. For example, if the processing 
> rate of documents drops below a specified threshold.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: About disabling hopcount tracking

2023-05-10 Thread Karl Wright
There is an option on the "hopcount" tab of your job to disable hopcount
tracking entirely.
Karl

On Tue, May 9, 2023 at 11:49 PM Mingchun Zhao 
wrote:

> Hi Karl,
>
> Could you please advise me on hopcount tracking?
> I'm using ManifoldCF 2.24 with PostgreSQL 12.14 as the database for now.
> In my case, I don't need the 'Hop Filters' feature, so I'd like
> to disable hopcount tracking and reduce the insert/update/delete load
> on the 'intrinsiclink' and 'hopcount' tables. So I have two questions
> about this.
> First, is there an option to disable hopcount tracking?
> Second, if I disable hopcount tracking, can it affect other crawling
> processes?
>
> Thank you in advance.
> Kind regards,
> Mingchun
>


[jira] [Commented] (CONNECTORS-1740) Solr 9 output connector

2023-04-14 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17712329#comment-17712329
 ] 

Karl Wright commented on CONNECTORS-1740:
-

It looks like there's an existing and unresolved Zookeeper issue for this:

https://issues.apache.org/jira/browse/ZOOKEEPER-3828

Unfortunately this makes going to Solr 9 a no-go, unless we can use an older 
version of Zookeeper without this problem.  Have you tried that?



> Solr 9 output connector
> ---
>
> Key: CONNECTORS-1740
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1740
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Lucene/SOLR connector
>Affects Versions: ManifoldCF 2.23
>Reporter: Julien Massiera
>Assignee: Julien Massiera
>Priority: Major
>
> The current Solr output connector is not compatible with Solr 9.x.
> We need to update the connector with SolrJ 9 and check whether the custom 
> code (multipart post requests, basic/preemptive auth) is still required and, 
> if it is, port it.





[jira] [Commented] (CONNECTORS-1740) Solr 9 output connector

2023-04-14 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17712327#comment-17712327
 ] 

Karl Wright commented on CONNECTORS-1740:
-

Zookeeper tests are failing because Zookeeper is not doing what it is supposed 
to.  Basically, sessions are being destroyed so fast that the test cannot 
actually do anything useful:

{code}
[junit] org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client 
session timed out, have not heard from server in 2007ms for session id 0x0
[junit] at 
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1250)
[junit] [reader] INFO org.apache.zookeeper.ZooKeeper - Session: 0x0 closed
[junit] [reader-EventThread] INFO org.apache.zookeeper.ClientCnxn - 
EventThread shut down for session: 0x0
[junit] [reader] INFO org.apache.zookeeper.ZooKeeper - Initiating client 
connection, connectString=localhost:8348 sessionTimeout=2000 
watcher=org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection$ZooKeeperWatcher@45c66a27
[junit] [reader] INFO org.apache.zookeeper.ClientCnxnSocket - 
jute.maxbuffer value is 1048575 Bytes
[junit] [reader] INFO org.apache.zookeeper.ClientCnxn - 
zookeeper.request.timeout value is 0. feature enabled=false
[junit] [reader-SendThread(localhost:8348)] INFO 
org.apache.zookeeper.ClientCnxn - Opening socket connection to server 
localhost/127.0.0.1:8348.
[junit] [reader-SendThread(localhost:8348)] INFO 
org.apache.zookeeper.ClientCnxn - SASL config status: Will not attempt to 
authenticate using SASL (unknown error)
[junit] [reader-SendThread(localhost:8348)] WARN 
org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from 
server in 2008ms for session id 0x0
[junit] [reader-SendThread(localhost:8348)] WARN 
org.apache.zookeeper.ClientCnxn - An exception was thrown while closing send 
thread for session 0x0.
[junit] org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client 
session timed out, have not heard from server in 2008ms for session id 0x0
{code}

This is, of course, fatal to ManifoldCF's use of Zookeeper for synchronization. 
 So we need to figure out how to work around this new "feature", or the whole 
project is a no-go.


> Solr 9 output connector
> ---
>
> Key: CONNECTORS-1740
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1740
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Lucene/SOLR connector
>Affects Versions: ManifoldCF 2.23
>Reporter: Julien Massiera
>Assignee: Julien Massiera
>Priority: Major
>
> The current Solr output connector is not compatible with Solr 9.x.
> We need to update the connector with SolrJ 9 and check whether the custom 
> code (multipart post requests, basic/preemptive auth) is still required and, 
> if it is, port it.





[jira] [Commented] (CONNECTORS-1740) Solr 9 output connector

2023-04-14 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17712321#comment-17712321
 ] 

Karl Wright commented on CONNECTORS-1740:
-

Turns out that this update also will require we move permanently from Java 8 to 
Java 11:

{code}
compile-connector:
[javac] Compiling 17 source files to 
C:\wip\mcf\CONNECTORS-1740\connectors\solr\build\connector\classes
[javac] 
C:\wip\mcf\CONNECTORS-1740\connectors\solr\connector\src\main\java\org\apache\manifoldcf\agents\output\solr\ModifiedHttp2SolrClient.java:3:
 error: cannot access Utils
[javac] import static org.apache.solr.common.util.Utils.getObjectByPath;
[javac]  ^
[javac]   bad class file: 
C:\wip\mcf\CONNECTORS-1740\lib\solr-solrj-9.1.0.jar(org/apache/solr/common/util/Utils.class)
[javac] class file has wrong version 55.0, should be 52.0
[javac] Please remove or make sure it appears in the correct 
subdirectory of the classpath.
{code}
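
For reference, the numbers in that javac error are class-file major versions: 52 is what Java 8 produces and 55 is what Java 11 produces (for modern releases, release = major - 44). Below is a minimal sketch for checking which release a given class file targets. The `ClassFileVersion` class and its method names are made up for illustration and are not part of ManifoldCF or SolrJ.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Illustrative helper (hypothetical name): inspects the header of a .class file.
class ClassFileVersion {

    /** Extracts the major version from the first 8 bytes of a class file. */
    static int majorVersion(byte[] header) {
        if (header.length < 8) {
            throw new IllegalArgumentException("need at least 8 header bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(header); // big-endian by default, like class files
        if (buf.getInt(0) != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file (bad magic number)");
        }
        // Bytes 4-5 hold the minor version, bytes 6-7 the major version.
        return buf.getShort(6) & 0xFFFF;
    }

    /** Maps a major version to a Java release; holds for Java 5 and later (52 -> 8, 55 -> 11). */
    static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) throws IOException {
        byte[] header = new byte[8];
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            new DataInputStream(in).readFully(header);
        }
        int major = majorVersion(header);
        System.out.println("major version " + major + " -> Java " + javaRelease(major));
    }
}
```

To use it, extract a class from the jar and pass its path, e.g. `java ClassFileVersion Utils.class`. Going by the error above, a class from solr-solrj-9.1.0.jar should report major version 55, which a Java 8 compiler cannot read.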


> Solr 9 output connector
> ---
>
> Key: CONNECTORS-1740
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1740
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Lucene/SOLR connector
>Affects Versions: ManifoldCF 2.23
>Reporter: Julien Massiera
>Assignee: Julien Massiera
>Priority: Major
>
> The current Solr output connector is not compatible with Solr 9.x.
> We need to update the connector with SolrJ 9 and check whether the custom 
> code (multipart post requests, basic/preemptive auth) is still required and, 
> if it is, port it.





[jira] [Commented] (CONNECTORS-1740) Solr 9 output connector

2023-04-12 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17711593#comment-17711593
 ] 

Karl Wright commented on CONNECTORS-1740:
-

[~julienFL], I'll look at the zookeeper changes.


> Solr 9 output connector
> ---
>
> Key: CONNECTORS-1740
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1740
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Lucene/SOLR connector
>Affects Versions: ManifoldCF 2.23
>Reporter: Julien Massiera
>Assignee: Julien Massiera
>Priority: Major
>
> The current Solr output connector is not compatible with Solr 9.x.
> We need to update the connector with SolrJ 9 and check whether the custom 
> code (multipart post requests, basic/preemptive auth) is still required and, 
> if it is, port it.





[jira] [Assigned] (CONNECTORS-1740) Solr 9 output connector

2023-04-12 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1740:
---

Assignee: Julien Massiera

> Solr 9 output connector
> ---
>
> Key: CONNECTORS-1740
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1740
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Lucene/SOLR connector
>Affects Versions: ManifoldCF 2.23
>Reporter: Julien Massiera
>Assignee: Julien Massiera
>Priority: Major
>
> The current Solr output connector is not compatible with Solr 9.x.
> We need to update the connector with SolrJ 9 and check whether the custom 
> code (multipart post requests, basic/preemptive auth) is still required and, 
> if it is, port it.





Re: Apache Manifold Documentum connector

2023-03-17 Thread Karl Wright
It was open-sourced back in 2012 at the same time ManifoldCF was
open-sourced.  It was written by a contractor paid by MetaCarta, who also
paid for the development of ManifoldCF itself (I developed that).  It was
spun off as open source when MetaCarta was bought by Nokia who had no
interest in the framework or the connectors.

I do not, off the top of my head, remember the contractor's name nor have
his contact information any longer.

There are many users of the Documentum Connector, however, and I would hope
one of them with more DQL experience will respond.

Karl



On Fri, Mar 17, 2023 at 5:41 AM Rasťa Šíša  wrote:

> Hi Karl, thanks for your answer! Would you be able to point me towards the
> author/git branch of the documentum connector?
> Best regards, Rasta
>
> On Thu, Mar 16, 2023 at 20:58, Karl Wright wrote:
>
>> Hi,
>>
>> I didn't write the documentum connector initially, so I trust that the
>> engineer who did knew how to construct the proper DQL.  I've not seen any
>> bugs related to it so it does seem to work.
>>
>> Karl
>>
>>
>> On Thu, Mar 16, 2023 at 8:23 AM Rasťa Šíša  wrote:
>>
>>> Hello,
>>> I would like to ask how the ManifoldCF Documentum connector selects the
>>> latest version of a document from the Documentum system.
>>>
>>> The first query that gets composed in DCTM.java collects a list of
>>> i_chronicle_id values. I would like to know, though, how ManifoldCF
>>> recognizes the latest version of a document (e.g. Effective status).
>>> From the UI, I am able to select some of the objecttypes, but not
>>> objecttypes (all).
>>>
>>> In DQL it is just, e.g.,
>>> *select i_chronicle_id from   *
>>> instead of *select i_chronicle_id from  (all)
>>> . *
>>>
>>> This "(all) object" returns all of them. With the first type of query,
>>> though, Documentum internally does not select the i_chronicle_id of
>>> documents that have a newly created version, e.g. when a document is
>>> approved and effective but someone has already created a new draft for
>>> it. With the (all) in the DQL, it brings in all the documents
>>> and their r_object_id, among which we can select the effective version by
>>> status.
>>> Is this a bug in the ManifoldCF Documentum connector, that it does not
>>> allow you to select those (all) objects and pick up the documents with new
>>> versions?
>>> Best regards,
>>> Rastislav Sisa
>>>
>>


Re: Apache Manifold Documentum connector

2023-03-16 Thread Karl Wright
Hi,

I didn't write the documentum connector initially, so I trust that the
engineer who did knew how to construct the proper DQL.  I've not seen any
bugs related to it so it does seem to work.

Karl


On Thu, Mar 16, 2023 at 8:23 AM Rasťa Šíša  wrote:

> Hello,
> I would like to ask how the ManifoldCF Documentum connector selects the
> latest version of a document from the Documentum system.
>
> The first query that gets composed in DCTM.java collects a list of
> i_chronicle_id values. I would like to know, though, how ManifoldCF
> recognizes the latest version of a document (e.g. Effective status).
> From the UI, I am able to select some of the objecttypes, but not
> objecttypes (all).
>
> In DQL it is just, e.g.,
> *select i_chronicle_id from   *
> instead of *select i_chronicle_id from  (all)
> . *
>
> This "(all) object" returns all of them. With the first type of query,
> though, Documentum internally does not select the i_chronicle_id of
> documents that have a newly created version, e.g. when a document is
> approved and effective but someone has already created a new draft for
> it. With the (all) in the DQL, it brings in all the documents
> and their r_object_id, among which we can select the effective version by
> status.
> Is this a bug in the ManifoldCF Documentum connector, that it does not allow
> you to select those (all) objects and pick up the documents with new versions?
> Best regards,
> Rastislav Sisa
>


Re: Job stucked with cleaning up status

2023-02-03 Thread Karl Wright
he loop making thread terminates normally! In quite a
> short time I always end up with no `DocumentDeleteThread`s at all and the
> framework transitions to an inconsistent state.
>
> In the end, I brought Solr back online and managed to finish the deletion
> successfully. But I think this case should be handled in some way.
>
> With respect,
> Abeleshev Artem
>
> On Sun, Jan 29, 2023 at 10:36 PM Karl Wright  wrote:
>
>> Hi,
>>
>> Document deletion processing in 2.22 is unchanged from probably the 10
>> previous versions of ManifoldCF.
>>
>> What likely is the case is that the connection to the output for the job
>> you are cleaning up is down.  When that happens, the documents are queued
>> but the delete worker threads cannot make any progress.
>>
>> You may be able to see this by looking at the "Simple Reports" for the job
>> in question, to see what it is doing and why the deletions are not succeeding.
>>
>> Karl
>>
>>
>> On Sun, Jan 29, 2023 at 8:18 AM Artem Abeleshev <
>> artem.abeles...@rondhuit.com> wrote:
>>
>>> Hi, everyone!
>>>
>>> Here is another problem that I run into sometimes. We are using ManifoldCF
>>> 2.22.1 with multiple nodes in production. The creation of the MCF job
>>> pipeline is handled via API calls from our service. We create jobs,
>>> repositories and output repositories. The crawler extracts documents and
>>> then they are pushed to Solr. The pipeline works OK.
>>>
>>> The problem is with deleting a job. Sometimes the job gets stuck
>>> with a `Cleaning up` status (in the DB it has status `e`, which corresponds
>>> to `STATUS_DELETING`). This time I used the MCF Web Admin to delete
>>> the job (pressed the delete button on the job list page).
>>>
>>> I have checked the sources and debugged a bit. The method
>>> `deleteJobsReadyForDelete()`
>>> (`org.apache.manifoldcf.crawler.jobs.JobManager.deleteJobsReadyForDelete()`)
>>> works OK. It is unable to delete the job because it still finds some
>>> documents in the document queue table. The following SQL is executed
>>> within this method:
>>>
>>> ```sql
>>> select id from jobqueue where jobid = '1658215015582' and (status = 'E'
>>> or status = 'D') limit 1;
>>> ```
>>>
>>> where the `E` status stands for `STATUS_ELIGIBLEFORDELETE` and the `D`
>>> status stands for `STATUS_BEINGDELETED`. If at least one such document is
>>> found in the queue, the method does nothing. At that moment I had a lot of
>>> documents in the `jobqueue` table with the indicated statuses (actually
>>> all of them had the `D` status).
>>>
>>> I see that the `Documents delete stuffer thread` is running, and it sets
>>> the `STATUS_BEINGDELETED` status on the documents via the
>>> `getNextDeletableDocuments()` method
>>> (`org.apache.manifoldcf.crawler.jobs.JobManager.getNextDeletableDocuments(String,
>>> int, long)`). But I can't find any logic that actually deletes the
>>> documents. I've searched through the sources, but the status
>>> `STATUS_BEINGDELETED` is mentioned mostly in `NOT EXISTS ...` queries.
>>> Searching backwards from `JobQueue`
>>> (`org.apache.manifoldcf.crawler.jobs.JobQueue`) didn't give me a result
>>> either. I would appreciate it if someone could point me to where to look,
>>> so I can debug and check what conditions are preventing the documents from
>>> being removed.
>>>
>>> Thank you!
>>>
>>> With respect,
>>> Artem Abeleshev
>>>
>>


Re: JCIFS: Possibly transient exception detected on attempt 1 while getting share security: All pipe instances are busy

2023-02-01 Thread Karl Wright
It looks like you are running with a profiler?  That uses a lot of memory.
Karl


On Wed, Feb 1, 2023 at 8:06 AM Bisonti Mario 
wrote:

> This is my hs_err_pid_.log
>
>
>
> Command Line: -Xms32768m -Xmx32768m
> -Dorg.apache.manifoldcf.configfile=./properties.xml
> -Djava.security.auth.login.config= -Dorg.apache.manifoldcf.processid=A
> org.apache.manifoldcf.agents.AgentRun
>
>
>
> .
>
> .
>
> .
>
> CodeHeap 'non-profiled nmethods': size=120032Kb used=23677Kb
> max_used=23677Kb free=96354Kb
>
> CodeHeap 'profiled nmethods': size=120028Kb used=20405Kb max_used=27584Kb
> free=99622Kb
>
> CodeHeap 'non-nmethods': size=5700Kb used=1278Kb max_used=1417Kb
> free=4421Kb
>
> Memory: 4k page, physical 72057128k(7300332k free), swap 4039676k(4039676k
> free)
>
> .
>
> .
>
>
>
> Could this perhaps be a RAM problem?
>
>
>
> Thanks a lot
>
> *From:* Bisonti Mario
> *Sent:* Friday, January 20, 2023 10:28
> *To:* user@manifoldcf.apache.org
> *Subject:* Re: JCIFS: Possibly transient exception detected on attempt 1
> while getting share security: All pipe instances are busy
>
>
>
> I see that the agent crashed:
>
> #
>
> # A fatal error has been detected by the Java Runtime Environment:
>
> #
>
> #  Internal Error (g1ConcurrentMark.cpp:1665), pid=2537463, tid=2537470
>
> #  fatal error: Overflow during reference processing, can not continue.
> Please increase MarkStackSizeMax (current value: 16777216) and restart.
>
> #
>
> # JRE version: OpenJDK Runtime Environment (11.0.16+8) (build
> 11.0.16+8-post-Ubuntu-0ubuntu120.04)
>
> # Java VM: OpenJDK 64-Bit Server VM (11.0.16+8-post-Ubuntu-0ubuntu120.04,
> mixed mode, tiered, g1 gc, linux-amd64)
>
> # Core dump will be written. Default location: Core dumps may be processed
> with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E"
> (or dumping to
> /opt/manifoldcf/multiprocess-zk-example-proprietary/core.2537463)
>
> #
>
> # If you would like to submit a bug report, please visit:
>
> #   https://bugs.launchpad.net/ubuntu/+source/openjdk-lts
>
> #
>
>
>
> ---  S U M M A R Y 
>
>
>
> Command Line: -Xms32768m -Xmx32768m
> -Dorg.apache.manifoldcf.configfile=./properties.xml
> -Djava.security.auth.login.config= -Dorg.apache.manifoldcf.processid=A
> org.apache.manifoldcf.agents.AgentRun
>
>
>
> Host: Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz, 8 cores, 68G, Ubuntu
> 20.04.4 LTS
>
> Time: Fri Jan 20 09:38:54 2023 CET elapsed time: 54532.106681 seconds (0d
> 15h 8m 52s)
>
>
>
> ---  T H R E A D  ---
>
>
>
> Current thread (0x7f051940a000):  VMThread "VM Thread" [stack:
> 0x7f051c50a000,0x7f051c60a000] [id=2537470]
>
>
>
> Stack: [0x7f051c50a000,0x00007f051c60a000],  sp=0x7f051c608080,
> free space=1016k
>
> Native frames: (J=compiled Java code, A=aot compiled Java code,
> j=interpreted, Vv=VM code, C=native code)
>
> V  [libjvm.so+0xe963a9]
>
> V  [libjvm.so+0x67b504]
>
> V  [libjvm.so+0x7604e6]
>
>
>
>
>
> So, where could I change that parameter?
>
> Is it an Agent configuration?
>
> Thanks a lot
>
> Mario
>
>
>
>
>
> *From:* Karl Wright 
> *Sent:* Wednesday, January 18, 2023 14:59
> *To:* user@manifoldcf.apache.org
> *Subject:* Re: JCIFS: Possibly transient exception detected on attempt 1
> while getting share security: All pipe instances are busy
>
>
>
> When you get a hang like this, getting a thread dump of the agents process
> is essential to figure out what the issue is.  You can't assume that a
> transient error would block anything because that's not how ManifoldCF
> works, at all.  Errors push the document in question back onto the queue
> with a retry time.
>
>
>
> Karl
>
>
>
>
>
> On Wed, Jan 18, 2023 at 6:15 AM Bisonti Mario 
> wrote:
>
> Hi Karl.
>
> But I noticed that the job was hanging: the count of processed documents
> was stuck at the same number, with no further document processing from
> 6 a.m. until I restarted the agent.
>
> *From:* Karl Wright 
> *Sent:* Wednesday, January 18, 2023 12:10
> *To:* user@manifoldcf.apache.org
> *Subject:* Re: JCIFS: Possibly transient exception detected on attempt 1
> while getting share security: All pipe instances are busy
>
>
>
> Hi, "Possibly transient issue" means that the error will be retried
> anyway, according to a schedule.  There should not need to be any
> requirement to shut down the agents proce

Re: Job stucked with cleaning up status

2023-01-29 Thread Karl Wright
Hi,

Document deletion processing in 2.22 is unchanged from probably the 10
previous versions of ManifoldCF.

What likely is the case is that the connection to the output for the job
you are cleaning up is down.  When that happens, the documents are queued
but the delete worker threads cannot make any progress.

You may be able to see this by looking at the "Simple Reports" for the job
in question, to see what it is doing and why the deletions are not succeeding.

Karl


On Sun, Jan 29, 2023 at 8:18 AM Artem Abeleshev <
artem.abeles...@rondhuit.com> wrote:

> Hi, everyone!
>
> Here is another problem that I run into sometimes. We are using ManifoldCF
> 2.22.1 with multiple nodes in production. The creation of the MCF job
> pipeline is handled via API calls from our service. We create jobs,
> repositories and output repositories. The crawler extracts documents and
> then they are pushed to Solr. The pipeline works OK.
>
> The problem is with deleting a job. Sometimes the job gets stuck with
> a `Cleaning up` status (in the DB it has status `e`, which corresponds to
> `STATUS_DELETING`). This time I used the MCF Web Admin to delete the job
> (pressed the delete button on the job list page).
>
> I have checked the sources and debugged a bit. The method
> `deleteJobsReadyForDelete()`
> (`org.apache.manifoldcf.crawler.jobs.JobManager.deleteJobsReadyForDelete()`)
> works OK. It is unable to delete the job because it still finds some
> documents in the document queue table. The following SQL is executed
> within this method:
>
> ```sql
> select id from jobqueue where jobid = '1658215015582' and (status = 'E' or
> status = 'D') limit 1;
> ```
>
> where the `E` status stands for `STATUS_ELIGIBLEFORDELETE` and the `D` status
> stands for `STATUS_BEINGDELETED`. If at least one such document is
> found in the queue, the method does nothing. At that moment I had a lot of
> documents in the `jobqueue` table with the indicated statuses (actually
> all of them had the `D` status).
>
> I see that the `Documents delete stuffer thread` is running, and it sets the
> `STATUS_BEINGDELETED` status on the documents via the
> `getNextDeletableDocuments()` method
> (`org.apache.manifoldcf.crawler.jobs.JobManager.getNextDeletableDocuments(String,
> int, long)`). But I can't find any logic that actually deletes the
> documents. I've searched through the sources, but the status
> `STATUS_BEINGDELETED` is mentioned mostly in `NOT EXISTS ...` queries.
> Searching backwards from `JobQueue`
> (`org.apache.manifoldcf.crawler.jobs.JobQueue`) didn't give me a result
> either. I would appreciate it if someone could point me to where to look, so
> I can debug and check what conditions are preventing the documents from
> being removed.
>
> Thank you!
>
> With respect,
> Artem Abeleshev
>


Re: JCIFS: Possibly transient exception detected on attempt 1 while getting share security: All pipe instances are busy

2023-01-18 Thread Karl Wright
When you get a hang like this, getting a thread dump of the agents process
is essential to figure out what the issue is.  You can't assume that a
transient error would block anything because that's not how ManifoldCF
works, at all.  Errors push the document in question back onto the queue
with a retry time.
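
For a separate agents process, the JDK's `jstack <pid>` (or `kill -3 <pid>` on Linux) is the usual way to get a thread dump. As a rough illustration of what such a dump contains, here is a minimal sketch that collects the same kind of information in-process through the standard `java.lang.management` API. The class name `ThreadDumpSketch` is made up for this example, and nothing here is ManifoldCF code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Illustrative only: prints name, state and stack of every live thread in this JVM.
class ThreadDumpSketch {

    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false/false: skip lock details, which not every JVM supports.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info == null) {
                continue; // thread ended while the dump was taken
            }
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

In a hang, the interesting part is where the worker threads are blocked: many threads sitting in the same frame (a socket read, a lock, a database call) usually points at the cause.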

Karl


On Wed, Jan 18, 2023 at 6:15 AM Bisonti Mario 
wrote:

> Hi Karl.
>
> But I noticed that the job was hanging: the count of processed documents
> was stuck at the same number, with no further document processing from
> 6 a.m. until I restarted the agent.
>
> *From:* Karl Wright 
> *Sent:* Wednesday, January 18, 2023 12:10
> *To:* user@manifoldcf.apache.org
> *Subject:* Re: JCIFS: Possibly transient exception detected on attempt 1
> while getting share security: All pipe instances are busy
>
>
>
> Hi, "Possibly transient issue" means that the error will be retried
> anyway, according to a schedule.  There should be no need to shut down
> the agents process and restart it.
>
> Karl
>
>
>
> On Wed, Jan 18, 2023 at 5:08 AM Bisonti Mario 
> wrote:
>
> Hi.
>
> I often get the following error:
>
> WARN 2023-01-18T06:18:19,316 (Worker thread '89') - JCIFS: Possibly
> transient exception detected on attempt 1 while getting share security: All
> pipe instances are busy.
>
> jcifs.smb.SmbException: All pipe instances are busy.
>
> at
> jcifs.smb.SmbTransportImpl.checkStatus2(SmbTransportImpl.java:1441)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at
> jcifs.smb.SmbTransportImpl.checkStatus(SmbTransportImpl.java:1552)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTransportImpl.sendrecv(SmbTransportImpl.java:1007)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTransportImpl.send(SmbTransportImpl.java:1523)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbSessionImpl.send(SmbSessionImpl.java:409)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTreeImpl.send(SmbTreeImpl.java:472)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTreeConnection.send0(SmbTreeConnection.java:399)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTreeConnection.send(SmbTreeConnection.java:314)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTreeConnection.send(SmbTreeConnection.java:294)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTreeHandleImpl.send(SmbTreeHandleImpl.java:130)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTreeHandleImpl.send(SmbTreeHandleImpl.java:117)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbFile.openUnshared(SmbFile.java:665)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at
> jcifs.smb.SmbPipeHandleImpl.ensureOpen(SmbPipeHandleImpl.java:169)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at
> jcifs.smb.SmbPipeHandleImpl.sendrecv(SmbPipeHandleImpl.java:250)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at
> jcifs.dcerpc.DcerpcPipeHandle.doSendReceiveFragment(DcerpcPipeHandle.java:113)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.dcerpc.DcerpcHandle.sendrecv(DcerpcHandle.java:243)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.dcerpc.DcerpcHandle.bind(DcerpcHandle.java:216)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.dcerpc.DcerpcHandle.sendrecv(DcerpcHandle.java:234)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbFile.getShareSecurity(SmbFile.java:2337)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at
> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.getFileShareSecurity(SharedDriveConnector.java:2468)
> [mcf-jcifs-connector.jar:?]
>
> at
> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.getFileShareSecuritySet(SharedDriveConnector.java:1243)
> [mcf-jcifs-connector.jar:?]
>
> at
> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:647)
> [mcf-jcifs-connector.jar:?]
>
>
>
> So, I have to stop the agent and restart it, and then the crawling continues.
>
>
>
> How could I solve my issue?
> Thanks a lot.
>
> Mario
>
>


Re: JCIFS: Possibly transient exception detected on attempt 1 while getting share security: All pipe instances are busy

2023-01-18 Thread Karl Wright
Hi, "Possibly transient issue" means that the error will be retried anyway,
according to a schedule.  There should be no need to shut down the agents
process and restart it.
Karl
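
To illustrate the idea of retrying on a schedule, here is a minimal sketch of a queue in which a failed document is put back with a "not before" time rather than blocking anything. This is a generic illustration and not ManifoldCF's actual implementation; `RetryQueueSketch`, `Entry` and all method names are assumptions made up for the example.

```java
import java.util.PriorityQueue;

// Generic illustration (hypothetical names): documents that fail with a transient
// error are re-queued with a retry time instead of stalling the crawl.
class RetryQueueSketch {

    static final class Entry implements Comparable<Entry> {
        final String documentId;
        final long notBeforeMillis; // earliest time this document may be processed
        final int attempt;

        Entry(String documentId, long notBeforeMillis, int attempt) {
            this.documentId = documentId;
            this.notBeforeMillis = notBeforeMillis;
            this.attempt = attempt;
        }

        @Override
        public int compareTo(Entry other) {
            return Long.compare(notBeforeMillis, other.notBeforeMillis);
        }
    }

    private final PriorityQueue<Entry> queue = new PriorityQueue<>();

    /** Queues a document for its first attempt, eligible immediately. */
    void add(String documentId, long nowMillis) {
        queue.add(new Entry(documentId, nowMillis, 1));
    }

    /** Puts a failed document back with a retry time in the future. */
    void requeueAfterFailure(Entry failed, long nowMillis, long retryDelayMillis) {
        queue.add(new Entry(failed.documentId, nowMillis + retryDelayMillis, failed.attempt + 1));
    }

    /** Returns the next document whose retry time has passed, or null if none is due. */
    Entry pollDue(long nowMillis) {
        Entry head = queue.peek();
        if (head == null || head.notBeforeMillis > nowMillis) {
            return null;
        }
        return queue.poll();
    }
}
```

The point of the design is that a worker which hits a transient error simply calls `requeueAfterFailure` and moves on to the next due document, so one failing document does not stop the others from being processed.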

On Wed, Jan 18, 2023 at 5:08 AM Bisonti Mario 
wrote:

> Hi.
>
> I often get the following error:
>
> WARN 2023-01-18T06:18:19,316 (Worker thread '89') - JCIFS: Possibly
> transient exception detected on attempt 1 while getting share security: All
> pipe instances are busy.
>
> jcifs.smb.SmbException: All pipe instances are busy.
>
> at
> jcifs.smb.SmbTransportImpl.checkStatus2(SmbTransportImpl.java:1441)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at
> jcifs.smb.SmbTransportImpl.checkStatus(SmbTransportImpl.java:1552)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTransportImpl.sendrecv(SmbTransportImpl.java:1007)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTransportImpl.send(SmbTransportImpl.java:1523)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbSessionImpl.send(SmbSessionImpl.java:409)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTreeImpl.send(SmbTreeImpl.java:472)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTreeConnection.send0(SmbTreeConnection.java:399)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTreeConnection.send(SmbTreeConnection.java:314)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTreeConnection.send(SmbTreeConnection.java:294)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTreeHandleImpl.send(SmbTreeHandleImpl.java:130)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbTreeHandleImpl.send(SmbTreeHandleImpl.java:117)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbFile.openUnshared(SmbFile.java:665)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at
> jcifs.smb.SmbPipeHandleImpl.ensureOpen(SmbPipeHandleImpl.java:169)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at
> jcifs.smb.SmbPipeHandleImpl.sendrecv(SmbPipeHandleImpl.java:250)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at
> jcifs.dcerpc.DcerpcPipeHandle.doSendReceiveFragment(DcerpcPipeHandle.java:113)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.dcerpc.DcerpcHandle.sendrecv(DcerpcHandle.java:243)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.dcerpc.DcerpcHandle.bind(DcerpcHandle.java:216)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.dcerpc.DcerpcHandle.sendrecv(DcerpcHandle.java:234)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at jcifs.smb.SmbFile.getShareSecurity(SmbFile.java:2337)
> ~[jcifs-ng-2.1.2.jar:?]
>
> at
> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.getFileShareSecurity(SharedDriveConnector.java:2468)
> [mcf-jcifs-connector.jar:?]
>
> at
> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.getFileShareSecuritySet(SharedDriveConnector.java:1243)
> [mcf-jcifs-connector.jar:?]
>
> at
> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:647)
> [mcf-jcifs-connector.jar:?]
>
>
>
> So, I have to stop the agent, restart it, and the crawling continues.
>
>
>
> How could I solve my issue?
> Thanks a lot.
>
> Mario
>


Re: Lucene 9.5.0 release

2023-01-17 Thread Karl Wright
+1 from me.

On Tue, Jan 17, 2023 at 11:32 AM Uwe Schindler  wrote:

> +1
>
> On 13.01.2023 at 10:54, Luca Cavanna wrote:
> > Hi all,
> > I'd like to propose that we release Lucene 9.5.0. There is a decent
> > amount of changes that would go into it looking at the github
> > milestone: https://github.com/apache/lucene/milestone/4 . I'd
> > volunteer to be the release manager. There is one PR open listed for
> > the 9.5 milestone: https://github.com/apache/lucene/pull/11873 . Is
> > this something that we do want to address before we release? Is
> > anybody aware of outstanding work that we would like to include or
> > known blocker issues that are not listed in the 9.5 milestone?
> >
> > Cheers
> > Luca
> >
> >
> >
> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: Help for subscribing the user mailing list of MCF

2023-01-10 Thread Karl Wright
Hmm - I haven't heard of difficulties like this before.  The mail manager
is used apache-wide; if it doesn't work the best thing to do would be to
create an infra ticket in JIRA.

Karl


On Tue, Jan 10, 2023 at 3:50 AM Koji Sekiguchi 
wrote:

> Hi Karl, everyone!
>
> I'm writing to the moderator of the MCF mailing list.
>
> I'd like you to help my colleague to subscribe to MCF user mailing list.
> He's tried to subscribe several times by sending the request to
> user-subscr...@manifoldcf.apache.org but he said that it seemed that
> they were just ignored and he couldn't get any responses from the
> system.
> The email address is abeleshev at gmail dot com.
>
> He has some questions and wants to contribute something if possible.
>
> Thanks!
>
> Koji
>


[jira] [Assigned] (CONNECTORS-1743) The Solr Output Connector should retry on a 502 Bad Gateway or 503 Service Unavailable

2023-01-09 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1743:
---

Assignee: Karl Wright

> The Solr Output Connector should retry on a 502 Bad Gateway or 503 Service 
> Unavailable
> --
>
> Key: CONNECTORS-1743
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1743
> Project: ManifoldCF
>  Issue Type: Improvement
>Reporter: Markus Günther
>    Assignee: Karl Wright
>Priority: Major
> Attachments: CONNECTORS-1743.patch
>
>
> The Solr Output Connector (cf. HttpPoster) triggers a retry if the Solr 
> response returns with HTTP status code 500. This behavior should be extended 
> to include a 502 Bad Gateway as well as 503 Service Unavailable as well, as 
> these indicate transient error situations.
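
The behavior requested above can be sketched as a small status-code check. This is an assumption-laden illustration, not the actual HttpPoster code or the attached patch; the class and method names are hypothetical.

```java
import java.util.Set;

final class SolrRetryStatusSketch {
    // 500 is the status the issue says is retried today;
    // 502 and 503 are the additions the issue proposes.
    private static final Set<Integer> RETRYABLE = Set.of(500, 502, 503);

    /** Returns true if a response with this HTTP status should be retried. */
    static boolean isRetryable(int httpStatus) {
        return RETRYABLE.contains(httpStatus);
    }

    public static void main(String[] args) {
        for (int status : new int[] {200, 400, 500, 502, 503}) {
            System.out.println(status + " -> " + (isRetryable(status) ? "retry" : "do not retry"));
        }
    }
}
```

Keeping the retryable codes in one set means further transient statuses could be added in a single place, should that ever be wanted.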



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Is Manifold capable of handling these kind of files

2022-12-23 Thread Karl Wright
The internals of ManifoldCF will handle this fine if you are sure to set
the encoding of your database to be UTF-8.  However, I don't know about the
JCIFS library, and whether there might be a restriction on characters in
that code base.  I think you'd have to just try it and see, frankly.

Karl
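
As a rough illustration of why the database encoding matters for a path like the one quoted below, this sketch checks whether a string survives an encode/decode round trip in a given charset. It says nothing about how ManifoldCF or JCIFS handle the path internally; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class PathEncodingSketch {
    /** Returns true if the text is unchanged after encoding to bytes and decoding again. */
    static boolean roundTrips(String text, Charset charset) {
        return new String(text.getBytes(charset), charset).equals(text);
    }

    public static void main(String[] args) {
        // Tail of the path from the question, including the non-Latin directory name.
        String path = "Input/__MACOSX/虎尾/._62A33A6377CF08B472CC2AB562BD8B5D.JPG";
        System.out.println("UTF-8 round trip:      " + roundTrips(path, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 round trip: " + roundTrips(path, StandardCharsets.ISO_8859_1));
    }
}
```

A single-byte charset such as ISO-8859-1 cannot represent the CJK characters, so they are replaced on encoding and the stored path no longer matches the original.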


On Fri, Dec 23, 2022 at 6:52 AM Priya Arora  wrote:

> Hi
>
> Is ManifoldCF capable of ingesting this kind of file through the Windows
> shares connector, where the path has special characters like these?
>
> demo/11208500/11208550/I. Proposal/PHASE II/220808
> Input/__MACOSX/虎尾/._62A33A6377CF08B472CC2AB562BD8B5D.JPG
>
>
> Any reply would be appreciated
>


[jira] [Resolved] (CONNECTORS-1741) Documentation no longer available?

2022-12-06 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1741.
-
Fix Version/s: ManifoldCF 2.24
   Resolution: Fixed

Was resolved by hand-pushing the required files to the mirror svn.


> Documentation no longer available?
> --
>
> Key: CONNECTORS-1741
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1741
> Project: ManifoldCF
>  Issue Type: Wish
>  Components: Documentation
>Reporter: Len G
>    Assignee: Karl Wright
>Priority: Minor
> Fix For: ManifoldCF 2.24
>
>
> The documentation on the website no longer seems to be available. When I try 
> to access the Release URLs at 
> [https://manifoldcf.apache.org/en_US/release-documentation.html], they are 
> not available (404 Not Found).
> Am I missing something here or looking in the wrong location? They used to be 
> there.
> Thanks
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Site release 2.24 docs do NOT contain release 2.24 docs, even though release artifact has them

2022-12-05 Thread Karl Wright
Problem fixed.  It looks like the python script I use to update the site in
svn is somehow broken for new files; they do not get added as they should
anymore.  I will have to figure out why but for now 2.24 release
documentation is live on the site.

Karl


On Mon, Dec 5, 2022 at 6:50 AM Piergiorgio Lucidi 
wrote:

> Hi Karl,
>
> Thank you for this update.
> It seems so strange because I think no one updated that part.
>
> Cheers,
> PJ
>
> On Mon, Dec 5, 2022 at 12:42, Karl Wright wrote:
>
> > Hi,
> >
> > I updated the release site yesterday evening with appropriate 2.24
> > references, and checked before I mirrored it that the site as built did
> > correctly contain the 2.24 release documents.  But when I uploaded to the
> > publish svn directory and committed these, the release documents all went
> > away. (!!!).  I don't have any explanation whatsoever for this at the
> > moment.  It will require further research when I have a chance to do so.
> >
> > Karl
> >
>
>
> --
> Piergiorgio
>


[jira] [Commented] (CONNECTORS-1741) Documentation no longer available?

2022-12-05 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17643466#comment-17643466
 ] 

Karl Wright commented on CONNECTORS-1741:
-

Infrastructure changes have apparently prevented us from publishing the release 
documentation on the site.  Need to research why that is happening.  Looked 
fine locally.


> Documentation no longer available?
> --
>
> Key: CONNECTORS-1741
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1741
> Project: ManifoldCF
>  Issue Type: Wish
>  Components: Documentation
>Reporter: Len G
>    Assignee: Karl Wright
>Priority: Minor
>
> The documentation on the website no longer seems to be available. When I try 
> to access the Release URLs at 
> [https://manifoldcf.apache.org/en_US/release-documentation.html], they are 
> not available (404 Not Found).
> Am I missing something here or looking in the wrong location? They used to be 
> there.
> Thanks
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (CONNECTORS-1741) Documentation no longer available?

2022-12-05 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1741:
---

Assignee: Karl Wright

> Documentation no longer available?
> --
>
> Key: CONNECTORS-1741
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1741
> Project: ManifoldCF
>  Issue Type: Wish
>  Components: Documentation
>Reporter: Len G
>    Assignee: Karl Wright
>Priority: Minor
>
> The documentation on the website no longer seems to be available. When I try 
> to access the Release URLs at 
> [https://manifoldcf.apache.org/en_US/release-documentation.html], they are 
> not available (404 Not Found).
> Am I missing something here or looking in the wrong location? They used to be 
> there.
> Thanks
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Site release 2.24 docs do NOT contain release 2.24 docs, even though release artifact has them

2022-12-05 Thread Karl Wright
Hi,

I updated the release site yesterday evening with appropriate 2.24
references, and checked before I mirrored it that the site as built did
correctly contain the 2.24 release documents.  But when I uploaded to the
publish svn directory and committed these, the release documents all went
away. (!!!).  I don't have any explanation whatsoever for this at the
moment.  It will require further research when I have a chance to do so.

Karl


Re: Release documentation missing for 2.23

2022-12-02 Thread Karl Wright
Hi,
I'm in the middle of several work-related escalations so will not respond
until Sunday or Monday.  Apologies in advance.
Karl


On Fri, Dec 2, 2022 at 3:11 AM Markus Schuch  wrote:

> Hi Karl,
>
> when i run `ant doc` on the release tag, i still get an error:
>
> Forrest generates a linkmap for broken links
>
> linkmap.html  BROKEN: Content is not allowed in prolog.
>
> site\build\site\broken-links.xml contains
>
> ```
> 
>linkmap.html
> 
> ```
>
> Do you know how to fix/debug this? I find no reference to a broken file
> in the build log.
>
> May be this is unrelated to the issues you found with the site
> generation. I wanted to share this, just to be sure we generate the
> documentation properly this time.
>
> Markus
>
>
> On 29.11.2022 at 01:11, Karl Wright wrote:
> > The release doesn't contain the built documentation for some reason.  Not
> > sure why.  Can't be fixed until a new release is built.
> > Karl
> >
> >
> > On Mon, Nov 28, 2022 at 7:08 PM Karl Wright  wrote:
> >
> >> Sorry, misread.  The documentation is updated by unpacking the releases
> >> and including the built documentation from each release in the
> appropriate
> >> place.  I will check to see whether the documentation didn't properly
> >> unpack or something during the build.
> >>
> >> Karl
> >>
> >>
> >> On Mon, Nov 28, 2022 at 7:06 PM Karl Wright  wrote:
> >>
> >>>
> >>>
> >>> On Mon, Nov 28, 2022 at 7:03 PM Karl Wright 
> wrote:
> >>>
> >>>> This URL
> https://manifoldcf.apache.org/en_US/release-documentation.html
> >>>> does not produce a 404 for me.  Are you sure?
> >>>>
> >>>> Karl
> >>>>
> >>>>
> >>>> On Mon, Nov 28, 2022 at 6:06 PM Markus Schuch 
> >>>> wrote:
> >>>>
> >>>>> The release documentation for ManifoldCF 2.23 is missing.
> >>>>>
> >>>>> The link at
> >>>>> https://manifoldcf.apache.org/en_US/release-documentation.html
> >>>>> produces 404.
> >>>>>
> >>>>> Do we have documentation how to properly release the documentation?
> >>>>>
> >>>>> Cheers
> >>>>> Markus
> >>>>>
> >>>>
> >
>


Re: Solr 9.x output connector

2022-12-01 Thread Karl Wright
Feel free to commit your proposed changes to a branch for evaluation.
Karl

On Thu, Dec 1, 2022 at 9:34 AM Julien Massiera <
julien.massi...@francelabs.com> wrote:

> Hi Karl,
>
> I did a quick alpha version of a Solr 9 connector to test: I can confirm
> that it works with older Solr versions !
>
> HOWEVER, in SolrJ 9, the Solr client has been reimplemented: it no longer
> lets us easily customize the httpClient and the way it performs
> requests. This makes it very challenging - at least for me - to port all
> of the custom code concerning the multipart post requests, as well as
> the basic and preemptive auth of the current Solr connector! Who knows,
> maybe with this new SolrJ client, those custom codes have become useless
> and now the multipart/basic/preemptive auth work OOTB... Unfortunately, I
> don't have time to test whether those functionalities work OOTB, not to
> mention that I don't have a test environment to give it a try. Maybe the
> MCF committers of these solr related updates could give it a look if I
> commit a final version of the connector on a dedicated branch ?
>
> Julien
>
> On 29/11/2022 22:35, Karl Wright wrote:
> > Hi Julien,
> >
> > Sorry for the delay; I've been under intense pressure at work of late and
> > just saw this email now.
> >
> > Regarding library updates: we should generally go ahead and do those
> > FIRST.  There are custom fixes for httpclient checked into the ManifoldCF
> > code base so we may need to work a little to get those to build properly.
> > But I'm reasonably sure it can be done.  Libraries are backwards
> compatible
> > at the minor version level so all is good there.  When somebody wants to
> go
> > to HttpClient 5, though, we are in trouble.
> >
> > AFTER that is done we should evaluate whether the 9.x Solr library is
> > backwards compatible enough with 8.x to work.  We had to do very little
> to
> > go from 7.x to 8.x, so unless the Solr people suddenly changed their
> > philosophy dramatically, it should be possible to do this too.  But we
> will
> > see.
> >
> > Karl
> >
> >
> > On Tue, Nov 29, 2022 at 9:59 AM Julien Massiera <
> > julien.massi...@francelabs.com> wrote:
> >
> >> Hi Karl,
> >>
> >> the Solr output connector does not seem to work with Solr 9.x according
> >> to our tests. We are going to either update or develop a new connector
> >> but there is a problem concerning the libraries required. A solr 9.x
> >> connector will of course involve a solrj 9.x lib but also the update of
> >> the following libs in MCF:
> >>
> >> - zookeeper from 3.4.10 to >= 3.7.0 (current 3.8.0)
> >> - httpcomponent.httpclient.version from 4.5.3 to 4.5.13
> >> - httpcomponent.httpcore.version from 4.4.6 to 4.4.15
> >> - httpcomponent.httpmime.version from 4.5.3 to 4.5.13
> >>
> >> Those updates should not cause problems to other connectors in MCF, the
> >> real problem here concerns the current Solr connector as I am not sure
> >> that an updated version would be compatible with a Solr < 9.x.
> >> There is also the modified solr clients using the custom multi-parts
> >> http post methods that, in my opinion, will be troublesome to port to
> >> SolrJ 9.x.
> >>
> >> If I am not wrong, historically those custom clients were developed to
> >> avoid errors with the embedded Tika of Solr for some documents. But
> >> IMHO, it has become a challenge that is not worth the effort: the way to
> >> go should be to have the documents processed by Tika BEFORE the Solr
> >> indexation. Not to mention that the tika embedded in Solr is too old
> >> (1.28.1) and will most certainly be removed someday (as stated in this
> >> ticket: https://issues.apache.org/jira/browse/SOLR-13973). Thus, I think
> >> it is not worth it to port the custom solr clients in the new connector.
> >> This would ease the creation of the Solr 9 output connector.
> >>
> >> Whatever happens, if we want to maintain output connectors for different
> >> versions of Solr, and IF the Solr 9 output connector is not compatible
> >> with previous versions of Solr (still needs to be checked), we'll end up
> >> with several versions of the libs in ManifoldCF. To be honest, I do not
> >> see a proper way to deal with the libs conflicts between the two
> >> connectors...
> >>
> >> What do you think ?
> >>
> >> Regards,
> >> Julien
> >>
> --
> Julien MASSIERA
> Head of Product Development
> France Labs – The Search experts
> Datafari – Winner of the Big Data 2018 trophy at the Digital Innovation Makers
> Summit
> www.francelabs.com
>
>

