[jira] [Commented] (NUTCH-1925) Upgrade Tika to version 1.7

2015-02-18 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325756#comment-14325756 ] Sebastian Nagel commented on NUTCH-1925: Hi [~tpalsulich], the patch breaks the

Re: Selenium Grid 2 Installation Tutorial for Mac

2015-02-18 Thread Jiaxin Ye
Hey Nagarjun, Thanks for the contribution. I have a question, though: how are you going to use Selenium Grid without a Nutch plugin? What I am thinking is that we need to install https://github.com/momer/nutch-selenium-grid-plugin to let Selenium Grid work with Nutch, don't we? Best, Jiaxin On

Re: HttpPostAuthentication Cannot Find Authentication Form

2015-02-18 Thread Lewis John Mcgibbney
Hi, On Tue, Feb 17, 2015 at 12:25 PM, dev-digest-h...@nutch.apache.org wrote: No form element found with 'id' = login, trying 'name'. No form element found with 'name' = login. The form element for id is not 'login', it is 'username'. The form element for the password is 'password'. I was

Re: Tesseract OCR and GDAL in Tika plugin for Nutch?

2015-02-18 Thread Tyler Palsulich
If you have gdal and Tesseract installed locally, they will be run against (eligible) parsed files in Tika. There shouldn't be any required configuration on the Nutch side. Please see http://wiki.apache.org/tika/TikaOCR and http://wiki.apache.org/tika/TikaGDAL for how to install/run them. Hope

Re: Selenium Grid 2 Installation Tutorial for Mac

2015-02-18 Thread Nagarjun Pola
Jiaxin, Yes, but unfortunately the Selenium Grid plugin is supported for Nutch 2.2.1. I have installed Selenium Grid independently and could run some tests, but I am still working on using it with Nutch 1.10. Any pointers would be helpful. Best, Nagarjun Pola On Wed, Feb 18, 2015 at 1:50

[Maven Failed]

2015-02-18 Thread Jiaxin Ye
Hi All, I restarted my Mac and found that when I type mvn on the command line I get the following error. Does anyone know why? local mvn log4j:WARN No appenders could be found for logger (org.apache.maven.cli.logging.impl.UnsupportedSlf4jBindingConfiguration). log4j:WARN Please initialize the log4j

Re: Tesseract OCR and GDAL in Tika plugin for Nutch?

2015-02-18 Thread Jiaxin Ye
Got it. Thanks! On Wed, Feb 18, 2015 at 3:07 PM, Tyler Palsulich tpalsul...@gmail.com wrote: Please see NUTCH-1925 for the current status of upgrading Tika to version 1.7. The current released version of Nutch uses Tika 1.6. You can try applying the patch there (v2 for 1.x versions) or

Re: HttpPostAuthentication Cannot Find Authentication Form

2015-02-18 Thread Mohammad Al-Mohsin
Hi Lewis, According to the documentation (in the file httpclient-auth.xml.template): loginFormId - the form id=$formId attribute value (or the 'name' attribute if no form is referenced by 'id' attribute). So I'm pretty sure I got it right, as the page's HTML source contains: form accept-charset=UTF-8

Re: Nutch-Selenium in Nutch 1.10

2015-02-18 Thread Jaydeep Bagrecha
Thanks Jiaxin! I repeated the entire installation procedure and I think I have installed it correctly (it said BUILD SUCCESSFUL after the ant runtime command, and there are Selenium jar files in the runtime/local/lib folder). When I started crawling, the Mozilla browser popped up twice, but when I saw

Re: Tesseract OCR and GDAL in Tika plugin for Nutch?

2015-02-18 Thread Jiaxin Ye
Hi Tyler, Is there any way to test whether the newest version of Tika is working on Nutch? On Wednesday, February 18, 2015, Tyler Palsulich tpalsul...@gmail.com wrote: If you have gdal and Tesseract installed locally, they will be run against (eligible) parsed files in Tika. There shouldn't be

Re: Tesseract OCR and GDAL in Tika plugin for Nutch?

2015-02-18 Thread Mattmann, Chris A (3980)
Parser checker. Sent from my iPhone. On Feb 18, 2015, at 3:03 PM, Jiaxin Ye jiaxi...@usc.edu wrote: Hi Tyler, Is there anyway to test if newest version of tika is working on Nutch or not? On Wednesday, February 18, 2015, Tyler Palsulich

Re: Tesseract OCR and GDAL in Tika plugin for Nutch?

2015-02-18 Thread Tyler Palsulich
Please see NUTCH-1925 for the current status of upgrading Tika to version 1.7. The current released version of Nutch uses Tika 1.6. You can try applying the patch there (v2 for 1.x versions) or checking out trunk. Tyler On Wed, Feb 18, 2015 at 6:00 PM, Jiaxin Ye jiaxi...@usc.edu wrote: Hi

Build failed in Jenkins: Nutch-nutchgora #1343

2015-02-18 Thread Apache Jenkins Server
See https://builds.apache.org/job/Nutch-nutchgora/1343/ -- [...truncated 3219 lines...] compile: jar: deps-test: deploy: copy-generated-lib: deploy: [copy] Copying 1 file to