Thanks Sebastian. I will try it and if it works, I will merge the fixes you
guys put out.
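
For anyone following the thread, here is a minimal Java sketch of the parsing
step behind the "pattern" workaround quoted below. Ant documents the touch
task's pattern attribute as a SimpleDateFormat-compatible pattern, so the same
string can be tried directly. Assumptions on my side: the class and method
names (TouchDateCheck, parseTouchDate) are made up for illustration, the
sketch uses "yyyy" (calendar year) where the quoted workaround has "YYYY"
(which SimpleDateFormat treats as week year), and it pins Locale.US and UTC
explicitly. It is not a copy of what Ant does internally.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, for illustration only; not part of Ant or Nutch.
final class TouchDateCheck {

    // Parses a timestamp such as "01/25/1971 2:00 pm" with an explicit
    // pattern and a fixed locale and time zone, so the result does not
    // depend on the defaults of the JVM running the build.
    static Date parseTouchDate(String text) {
        SimpleDateFormat fmt = new SimpleDateFormat("MM/dd/yyyy hh:mm a", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return fmt.parse(text);
        } catch (ParseException e) {
            // Same kind of failure that Ant reports as "Unparseable date".
            throw new IllegalArgumentException("Unparseable date: \"" + text + "\"", e);
        }
    }

    public static void main(String[] args) {
        Date d = parseTouchDate("01/25/1971 2:00 pm");
        System.out.println(d.toInstant()); // prints 1971-01-25T14:00:00Z
    }
}
```

Swapping Locale.US for another locale in the sketch is a quick way to check
whether the am/pm marker still parses there.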

On Tue, Jan 17, 2023 at 4:02 AM Sebastian Nagel <wastl.na...@googlemail.com>
wrote:

> Hi Kamil,
>
> after some trials I came up with a different solution for the issue with the
> "unparseable date", see
>
>    https://github.com/apache/nutch/pull/752
>
> The workaround that adds a pattern fails reproducibly in certain locales, see
> the comments in
>
>    https://issues.apache.org/jira/browse/NUTCH-2974
>
> Just in case you want to try it.
>
> ~Sebastian
>
> On 11/21/22 10:36, Sebastian Nagel wrote:
> > Hi Kamil,
> >
> > thanks for trying and finding a solution! I've opened a JIRA issue to track the
> > problem: https://issues.apache.org/jira/browse/NUTCH-2974
> >
> > Thanks!
> >
> > Sebastian
> >
> > On 11/19/22 18:37, Kamil Mroczek wrote:
> >> I've been able to work around this issue by adding a "pattern" attribute to the
> >> touch tag on line 101 in build.xml, like so:
> >>
> >> <touch datetime="01/25/1971 2:00 pm" pattern="MM/dd/YYYY hh:mm a">
> >>
> >> On Fri, Nov 18, 2022 at 12:32 PM Kamil Mroczek <kamil@elio.earth> wrote:
> >>
> >>> Hello,
> >>>
> >>> When I run the "ant runtime" command I am getting:
> >>>
> >>> /home/hadoop/apache-nutch/build.xml:101: Unparseable date: "01/25/1971 2:00 pm"
> >>>
> >>> I've tried different date formats to no avail. There was a similar issue
> >>> that was fixed in version 1.19, NUTCH-2512
> >>> <https://issues.apache.org/jira/browse/NUTCH-2512>. I am using Nutch
> >>> 1.19. I am using Java 11. This is running on the AWS EMR master node using
> >>> a vanilla AMI running AWS Linux 2.0.20221004.0. Some more debugging info
> >>> below.
> >>>
> >>> Kamil
> >>> =============
> >>> [hadoop@ip-172-31-25-62 apache-nutch]$ java -version
> >>> openjdk version "11.0.16.1" 2022-08-12 LTS
> >>> OpenJDK Runtime Environment Corretto-11.0.16.9.1 (build 11.0.16.1+9-LTS)
> >>> OpenJDK 64-Bit Server VM Corretto-11.0.16.9.1 (build 11.0.16.1+9-LTS, mixed mode)
> >>>
> >>> [hadoop@ip-172-31-25-62 apache-nutch]$ locale
> >>> LANG=en_US.UTF-8
> >>> LC_CTYPE="en_US.UTF-8"
> >>> LC_NUMERIC="en_US.UTF-8"
> >>> LC_TIME="en_US.UTF-8"
> >>> LC_COLLATE="en_US.UTF-8"
> >>> LC_MONETARY="en_US.UTF-8"
> >>> LC_MESSAGES="en_US.UTF-8"
> >>> LC_PAPER="en_US.UTF-8"
> >>> LC_NAME="en_US.UTF-8"
> >>> LC_ADDRESS="en_US.UTF-8"
> >>> LC_TELEPHONE="en_US.UTF-8"
> >>> LC_MEASUREMENT="en_US.UTF-8"
> >>> LC_IDENTIFICATION="en_US.UTF-8"
> >>> LC_ALL=
> >>>
> >>> ------- Ant diagnostics report -------
> >>> Apache Ant(TM) version 1.9.2 compiled on November 13 2017
> >>>
> >>> -------------------------------------------
> >>>   Implementation Version
> >>> -------------------------------------------
> >>> core tasks     : 1.9.2 in file:/usr/share/java/ant/ant.jar
> >>>
> >>> -------------------------------------------
> >>>   ANT PROPERTIES
> >>> -------------------------------------------
> >>> ant.version: Apache Ant(TM) version 1.9.2 compiled on November 13 2017
> >>> ant.java.version: 1.8
> >>> Is this the Apache Harmony VM? no
> >>> Is this the Kaffe VM? no
> >>> Is this gij/gcj? no
> >>> ant.core.lib: /usr/share/java/ant/ant.jar
> >>> ant.home: /usr/share/ant
> >>>
> >>> -------------------------------------------
> >>>   ANT_HOME/lib jar listing
> >>> -------------------------------------------
> >>> ant.home: /usr/share/ant
> >>> ant-bootstrap.jar (20919 bytes)
> >>> ant-launcher.jar (19038 bytes)
> >>> ant.jar (1998416 bytes)
> >>>
> >>> -------------------------------------------
> >>>   USER_HOME/.ant/lib jar listing
> >>> -------------------------------------------
> >>> user.home: /home/hadoop
> >>> No such directory.
> >>>
> >>> -------------------------------------------
> >>>   Tasks availability
> >>> -------------------------------------------
> >>> junitreport : Not Available (the implementation class is not present)
> >>> sshsession : Not Available (the implementation class is not present)
> >>> sshexec : Not Available (the implementation class is not present)
> >>> telnet : Not Available (the implementation class is not present)
> >>> scp : Not Available (the implementation class is not present)
> >>> antlr : Not Available (the implementation class is not present)
> >>> netrexxc : Not Available (the implementation class is not present)
> >>> ftp : Not Available (the implementation class is not present)
> >>> rexec : Not Available (the implementation class is not present)
> >>> sound : Not Available (the implementation class is not present)
> >>> image : Not Available (the implementation class is not present)
> >>> junit : Not Available (the implementation class is not present)
> >>> jdepend : Not Available (the implementation class is not present)
> >>> splash : Not Available (the implementation class is not present)
> >>> A task being missing/unavailable should only matter if you are trying to use it
> >>>
> >>> -------------------------------------------
> >>>   org.apache.env.Which diagnostics
> >>> -------------------------------------------
> >>> Not available.
> >>> Download it at http://xml.apache.org/commons/
> >>>
> >>> -------------------------------------------
> >>>   XML Parser information
> >>> -------------------------------------------
> >>> XML Parser : org.apache.xerces.jaxp.SAXParserImpl
> >>> XML Parser Location: file:/usr/share/java/xerces-j2.jar
> >>> Namespace-aware parser : org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser
> >>> Namespace-aware parser Location: file:/usr/share/java/xerces-j2.jar
> >>>
> >>> -------------------------------------------
> >>>   XSLT Processor information
> >>> -------------------------------------------
> >>> XSLT Processor : com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl
> >>> XSLT Processor Location: unknown
> >>>
> >>> -------------------------------------------
> >>>   System properties
> >>> -------------------------------------------
> >>> java.runtime.name : OpenJDK Runtime Environment
> >>> java.vm.version : 11.0.16.1+9-LTS
> >>> sun.boot.library.path : /usr/lib/jvm/java-11-amazon-corretto.x86_64/lib
> >>> ant.library.dir : /usr/share/ant/lib
> >>> java.vm.vendor : Amazon.com Inc.
> >>> java.vendor.url : https://aws.amazon.com/corretto/
> >>> path.separator : :
> >>> java.vm.name : OpenJDK 64-Bit Server VM
> >>> sun.os.patch.level : unknown
> >>> user.country : US
> >>> sun.java.launcher : SUN_STANDARD
> >>> java.vm.specification.name : Java Virtual Machine Specification
> >>> user.dir : /home/hadoop/apache-nutch
> >>> java.vm.compressedOopsMode : Zero based
> >>> java.runtime.version : 11.0.16.1+9-LTS
> >>> java.awt.graphicsenv : sun.awt.X11GraphicsEnvironment
> >>> os.arch : amd64
> >>> java.io.tmpdir : /tmp
> >>> line.separator :
> >>>
> >>> java.vm.specification.vendor : Oracle Corporation
> >>> os.name : Linux
> >>> ant.home : /usr/share/ant
> >>> sun.jnu.encoding : UTF-8
> >>> java.library.path : /usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib
> >>> jdk.debug : release
> >>> java.class.version : 55.0
> >>> java.specification.name : Java Platform API Specification
> >>> sun.management.compiler : HotSpot 64-Bit Tiered Compilers
> >>> os.version : 4.14.294-220.533.amzn2.x86_64
> >>> user.home : /home/hadoop
> >>> user.timezone :
> >>> java.awt.printerjob : sun.print.PSPrinterJob
> >>> file.encoding : UTF-8
> >>> java.specification.version : 11
> >>> user.name : hadoop
> >>> java.class.path : /usr/share/java/ant.jar:/usr/share/java/ant-launcher.jar:/usr/share/java/jaxp_parser_impl.jar:/usr/share/java/xml-commons-apis.jar:/usr/share/ant/lib/ant-bootstrap.jar:/usr/share/ant/lib/ant-launcher.jar:/usr/share/ant/lib/ant.jar
> >>> java.vm.specification.version : 11
> >>> sun.arch.data.model : 64
> >>> sun.java.command : org.apache.tools.ant.launch.Launcher -cp -diagnostics
> >>> java.home : /usr/lib/jvm/java-11-amazon-corretto.x86_64
> >>> user.language : en
> >>> java.specification.vendor : Oracle Corporation
> >>> awt.toolkit : sun.awt.X11.XToolkit
> >>> java.vm.info : mixed mode
> >>> java.version : 11.0.16.1
> >>> java.vendor : Amazon.com Inc.
> >>> file.separator : /
> >>> java.version.date : 2022-08-12
> >>> java.vendor.url.bug : https://github.com/corretto/corretto-11/issues/
> >>> sun.io.unicode.encoding : UnicodeLittle
> >>> sun.cpu.endian : little
> >>> java.vendor.version : Corretto-11.0.16.9.1
> >>> sun.cpu.isalist :
> >>>
> >>> -------------------------------------------
> >>>   Temp dir
> >>> -------------------------------------------
> >>> Temp dir is /tmp
> >>> Temp dir is writeable
> >>> Temp dir alignment with system clock is 11 ms
> >>>
> >>> -------------------------------------------
> >>>   Locale information
> >>> -------------------------------------------
> >>> Timezone Coordinated Universal Time offset=0
> >>>
> >>> -------------------------------------------
> >>>   Proxy information
> >>> -------------------------------------------
> >>> Java1.5+ proxy settings:
> >>> Direct connection
> >>>
> >>
>
