OK,
I compiled Nutch under JDK 11.
Did some basic fetching, parsing, link inversion and subsequent indexing to Solr 9.
[+1]

Great work!
RRK
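
P.S. For anyone who hits the invertlinks "input path does not exist" message quoted below: the thread never says what the cause was. One possible cause, and this is only an assumption, is a segment that has not been parsed yet, since link inversion reads each segment's parse_data directory. A minimal sketch to spot such segments (the function name and any path you pass in are hypothetical, not part of Nutch):

```python
from pathlib import Path


def segments_missing_parse_data(segments_dir):
    """Return the names of segment directories without a parse_data subdirectory.

    Assumption: segments_dir holds one subdirectory per segment,
    as in a local Nutch crawl directory.
    """
    root = Path(segments_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"segments directory not found: {root}")
    return sorted(
        seg.name
        for seg in root.iterdir()
        if seg.is_dir() and not (seg / "parse_data").is_dir()
    )
```

Calling it on a segments directory returns a sorted list of segment names that would still need a parse step; an empty list means every segment has parse_data.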

On Tue, Aug 30, 2022 at 12:22 PM BlackIce <blackice...@gmail.com> wrote:
>
> Tried some indexing... but when manually running "invertlinks" it says
> something about the input path not existing.
> Has invertlinks changed since 1.18?
>
> Greetz
> RRK
>
> On Mon, Aug 29, 2022 at 3:38 PM BlackIce <blackice...@gmail.com> wrote:
> >
> > Haven't indexed anything to Solr yet... gonna give it a shot in a few hours
> >
> > On Mon, Aug 29, 2022 at 2:17 PM Markus Jelsma
> > <markus.jel...@openindex.io> wrote:
> > >
> > > Hello Sebastian,
> > >
> > > No, the JAR isn't present. Multiple JARs are missing, probably because they
> > > are loaded after httpasyncclient. I checked the previously emptied Ivy
> > > cache. The Ivy files are there, but the JAR is missing there too.
> > >
> > > markus@midas:~$ ls .ivy2/cache/org.apache.httpcomponents/httpasyncclient/
> > > ivy-4.1.4.xml  ivy-4.1.4.xml.original  ivydata-4.1.4.properties
> > >
> > > I manually downloaded the JAR from [1] and added it to the jars/ directory
> > > in the Ivy cache. It still cannot find the JAR; perhaps the Ivy cache needs
> > > more than just the manually added JAR.
> > >
> > > The odd thing is that I got the URL below from the ivydata-4.1.4.properties
> > > file in the cache.
> > >
> > > Since Ralf can compile it without problems, it seems to be an issue on my
> > > machine only. So Nutch seems fine, therefore +1.
> > >
> > > Regards,
> > > Markus
> > >
> > > [1]
> > > https://repo1.maven.org/maven2/org/apache/httpcomponents/httpasyncclient/4.1.4/
> > >
> > >
> > > On Sun, Aug 28, 2022 at 12:05, Sebastian Nagel
> > > <wastl.na...@googlemail.com.invalid> wrote:
> > >
> > > > Hi Ralf,
> > > >
> > > > > It fetches it parses
> > > >
> > > > So a +1 ?
> > > >
> > > > Best,
> > > > Sebastian
> > > >
> > > > On 8/25/22 05:22, BlackIce wrote:
> > > > > nevermind I made a typo...
> > > > >
> > > > > It fetches it parses
> > > > >
> > > > > On Thu, Aug 25, 2022 at 3:42 AM BlackIce <blackice...@gmail.com> wrote:
> > > > >>
> > > > >> so far... it doesn't select anything when creating segments:
> > > > >> 0 records selected for fetching, exiting
> > > > >>
> > > > >> On Wed, Aug 24, 2022 at 3:02 PM BlackIce <blackice...@gmail.com> wrote:
> > > > >>>
> > > > >>> I have been able to compile under OpenJDK 11
> > > > >>> Have not done anything further so far
> > > > >>> I'm gonna try to get to it this evening
> > > > >>>
> > > > >>> Greetz
> > > > >>> Ralf
> > > > >>>
> > > > >>> On Wed, Aug 24, 2022 at 1:29 PM Markus Jelsma
> > > > >>> <markus.jel...@openindex.io> wrote:
> > > > >>>>
> > > > >>>> Hi,
> > > > >>>>
> > > > >>>> Everything seems fine; the crawler works when trying the binary
> > > > >>>> distribution. The source won't work because this computer still cannot
> > > > >>>> compile it. Clearing the local Ivy cache did not do much. This is the known
> > > > >>>> compiler error with the indexer-elastic plugin:
> > > > >>>> compile:
> > > > >>>>     [echo] Compiling plugin: indexer-elastic
> > > > >>>>    [javac] Compiling 3 source files to /home/markus/temp/apache-nutch-1.19/build/indexer-elastic/classes
> > > > >>>>    [javac] /home/markus/temp/apache-nutch-1.19/src/plugin/indexer-elastic/src/java/org/apache/nutch/indexwriter/elastic/ElasticIndexWriter.java:39: error: package org.apache.http.impl.nio.client does not exist
> > > > >>>>    [javac] import org.apache.http.impl.nio.client.HttpAsyncClientBuilder;
> > > > >>>>    [javac]                                       ^
> > > > >>>>    [javac] 1 error
> > > > >>>>
> > > > >>>>
> > > > >>>> The binary distribution works fine though. I do see a lot of new messages
> > > > >>>> when fetching:
> > > > >>>> 2022-08-24 13:21:15,867 INFO o.a.n.n.URLExemptionFilters [LocalJobRunner Map Task Executor #0] Found 0 extensions at point:'org.apache.nutch.net.URLExemptionFilter'
> > > > >>>>
> > > > >>>> This is also new at start of each task:
> > > > >>>> SLF4J: Class path contains multiple SLF4J bindings.
> > > > >>>> SLF4J: Found binding in [jar:file:/home/markus/temp/apache-nutch-1.19/lib/log4j-slf4j-impl-2.18.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > > > >>>> SLF4J: Found binding in [jar:file:/home/markus/temp/apache-nutch-1.19/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > > > >>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> > > > >>>> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> > > > >>>>
> > > > >>>> And this one at the end of fetcher:
> > > > >>>> log4j:WARN No appenders could be found for logger (org.apache.commons.httpclient.params.DefaultHttpParams).
> > > > >>>> log4j:WARN Please initialize the log4j system properly.
> > > > >>>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> > > > >>>>
> > > > >>>> I am worried about the indexer-elastic plugin, maybe others have that
> > > > >>>> problem too? Otherwise everything seems fine.
> > > > >>>>
> > > > >>>> Markus
> > > > >>>>
> > > > >>>> On Mon, Aug 22, 2022 at 17:30, Sebastian Nagel <sna...@apache.org> wrote:
> > > > >>>>
> > > > >>>>> Hi Folks,
> > > > >>>>>
> > > > >>>>> A first candidate for the Nutch 1.19 release is available at:
> > > > >>>>>
> > > > >>>>>    https://dist.apache.org/repos/dist/dev/nutch/1.19/
> > > > >>>>>
> > > > >>>>> The release candidate is a zip and tar.gz archive of the binary and
> > > > >>>>> sources in:
> > > > >>>>>    https://github.com/apache/nutch/tree/release-1.19
> > > > >>>>>
> > > > >>>>> In addition, a staged maven repository is available here:
> > > > >>>>>
> > > > >>>>>    https://repository.apache.org/content/repositories/orgapachenutch-1020
> > > > >>>>>
> > > > >>>>> We addressed 87 issues:
> > > > >>>>>    https://s.apache.org/lf6li
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> Please vote on releasing this package as Apache Nutch 1.19.
> > > > >>>>> The vote is open for the next 72 hours and passes if a majority
> > > > >>>>> of at least three +1 Nutch PMC votes are cast.
> > > > >>>>>
> > > > >>>>> [ ] +1 Release this package as Apache Nutch 1.19.
> > > > >>>>> [ ] -1 Do not release this package because ...
> > > > >>>>>
> > > > >>>>> Cheers,
> > > > >>>>> Sebastian
> > > > >>>>> (On behalf of the Nutch PMC)
> > > > >>>>>
> > > > >>>>> P.S.
> > > > >>>>> Here is my +1.
> > > > >>>>> - tested most of the Nutch tools and ran a test crawl on a single-node cluster
> > > > >>>>>   running Hadoop 3.3.4 (see
> > > > >>>>>   https://github.com/sebastian-nagel/nutch-test-single-node-cluster/)
> > > > >>>>>
> > > >
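
On the missing httpasyncclient JAR discussed in the thread above: a minimal sketch for checking whether an Ivy cache entry has its JAR next to the ivy-<revision>.xml descriptor. It assumes the layout described in the thread (descriptor in the module directory, JAR under jars/) and that the JAR is named <module>-<revision>.jar; the function name is hypothetical. It only reports which files are present and does not explain why Ivy may still fail to resolve a manually added JAR.

```python
from pathlib import Path


def ivy_cache_status(cache_dir, org, module, revision):
    """Return (has_descriptor, has_jar) for one module revision in an Ivy cache.

    Assumed layout:
      <cache_dir>/<org>/<module>/ivy-<revision>.xml
      <cache_dir>/<org>/<module>/jars/<module>-<revision>.jar
    """
    module_dir = Path(cache_dir) / org / module
    has_descriptor = (module_dir / f"ivy-{revision}.xml").is_file()
    has_jar = (module_dir / "jars" / f"{module}-{revision}.jar").is_file()
    return has_descriptor, has_jar
```

For the case in the thread, the call would be ivy_cache_status(Path.home() / ".ivy2" / "cache", "org.apache.httpcomponents", "httpasyncclient", "4.1.4"); a descriptor without a JAR matches the directory listing Markus posted.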