Just not able to get it working...
At first I got a Selenium timeout exception even
with libselenium.page.load.delay set. The solution was to increase the
value of page.load.delay, which defaults to 3.

Then I got stuck on the Selenium output, which shows "You need to enable
JavaScript".

I am running Nutch with this command:
./bin/nutch parsechecker -Dplugin.includes='protocol-selenium|parse-tika' \
 -Dselenium.enable.headless=true \
 -Dlibselenium.page.load.delay=120 \
 -Dpage.load.delay=120 \
 -followRedirects -dumpText https://metais.slovensko.sk

I went through the source code of the libselenium and selenium protocol
plugins, with no success.

What else can I try to get such a page crawled?

Peter
