Problem installing Selenium on Ubuntu with Nutch trunk 1.10

2015-02-20 Thread Yash Sangani
Hi, I have checked out the latest Nutch trunk (1.10), which has Tika 1.7. I followed all the steps mentioned on the Selenium GitHub page and also applied the patch. Yet when I try to build Nutch, I get the following errors. [javac]

Re: Problem installing Selenium on Ubuntu with Nutch trunk 1.10

2015-02-20 Thread Yash Sangani
I thought we had to build Nutch again after this, so I tried to build it, but then I got the error mentioned in the first email. So I haven't tried to crawl yet. On Fri, Feb 20, 2015 at 1:26 AM, zhangxin0804 zhangxin0...@gmail.com wrote: I got you. I hit the same problem and got stuck at

Re: Problem installing Selenium on Ubuntu with Nutch trunk 1.10

2015-02-20 Thread Yash Sangani
Yes, I did. Those were because the ivy.xml file had the same attributes multiple times. On Fri, Feb 20, 2015 at 12:40 AM, Jiaxin Ye jiaxi...@usc.edu wrote: I guess that means you failed to install the patch. When you installed the patch, did you see any failures on ivy.xml? On Friday, February 20,
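Editor's note: an XML file in which one element carries the same attribute twice is not well-formed, so a quick parse can locate the offending line before a build is attempted. The following is a minimal sketch, not part of the thread; the function name `check_well_formed` and the sample `dependency` element are hypothetical, and it assumes the problem is a plain duplicate attribute as described above.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses, else a (line, column, message) tuple."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple reported by the parser.
        line, column = err.position
        return (line, column, str(err))
    return None


# Hypothetical fragment: the second element repeats the "rev" attribute,
# as can happen when a patch hunk is applied on top of an edited file.
good = '<dependencies>\n<dependency org="example" name="demo" rev="1.0"/>\n</dependencies>'
bad = '<dependencies>\n<dependency org="example" name="demo" rev="1.0" rev="1.0"/>\n</dependencies>'

print(check_well_formed(good))
print(check_well_formed(bad))
```

To check a real file, the text of ivy.xml could be read from disk and passed to the same function; the reported line number points at the element to fix.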

Re: Problem installing Selenium on Ubuntu with Nutch trunk 1.10

2015-02-20 Thread zhangxin0804
Hi Yash Sangani, Have you successfully installed Xvfb on Ubuntu? -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-installing-Selenium-on-Ubuntu-with-Nutch-trunk-1-10-tp4187576p4187580.html Sent from the Nutch - Dev mailing list archive at Nabble.com.

Re: Problem installing Selenium on Ubuntu with Nutch trunk 1.10

2015-02-20 Thread Yash Sangani
No, after I enter the first command, the console prints some lines: Initializing built-in extension Generic Event Extension Initializing built-in extension SHAPE Initializing built-in extension MIT-SHM Initializing built-in extension XInputExtension Initializing built-in extension XTEST

Re: Problem installing Selenium on Ubuntu with Nutch trunk 1.10

2015-02-20 Thread zhangxin0804
I got you. I hit the same problem and got stuck here. Did you try opening another terminal and crawling again? Give it a try and check whether Firefox still keeps popping up again and again.

Re: Problem installing Selenium on Ubuntu with Nutch trunk 1.10

2015-02-20 Thread Jiaxin Ye
I guess that means you failed to install the patch. When you installed the patch, did you see any failures on ivy.xml? On Friday, February 20, 2015, Yash Sangani ysang...@usc.edu wrote: Hi, I have checked out the latest Nutch trunk(1.10) which has tika 1.7. I followed all the steps mentioned on the

Re: [ANNOUNCE] New Nutch committer and PMC - Jorge Luis Betancourt Gonzalez

2015-02-20 Thread Lewis John McGibbney
DYNAMITE On Thu, Feb 19, 2015 at 11:00 PM, dev-digest-h...@nutch.apache.org wrote:

[jira] [Commented] (NUTCH-1928) Indexing filter of documents by the MIME type

2015-02-20 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328983#comment-14328983 ] Lewis John McGibbney commented on NUTCH-1928: [~jorgelbg] can you please

Re: Problem Fetching with Selenium Installed

2015-02-20 Thread Qing Liu
Hi Nagarjun, I'm really confused: since we are using a virtual framebuffer, why would Firefox pop up? I started Xvfb, then launched Firefox from the terminal and nothing popped up, but Firefox is running. On Thu, Feb 19, 2015 at 10:58 PM, Nagarjun Pola np...@usc.edu wrote: Thank You Mohammed. I
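Editor's note: one possible explanation, which the thread does not confirm, is that the process launching Firefox does not have DISPLAY pointing at the Xvfb display, so the window opens on the real desktop instead. The sketch below only builds the Xvfb command line and an environment for child processes; the helper names and the default display number 99 are assumptions chosen for illustration.

```python
import os


def xvfb_command(display=99, width=1024, height=768, depth=24):
    """Build the argv list that starts Xvfb on the given display number."""
    return ["Xvfb", f":{display}", "-screen", "0", f"{width}x{height}x{depth}"]


def env_for_display(display=99, base_env=None):
    """Copy an environment and point DISPLAY at the virtual framebuffer."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = f":{display}"
    return env


print(xvfb_command(11))
print(env_for_display(11, {"PATH": "/usr/bin"})["DISPLAY"])
```

The two results can be handed to `subprocess.Popen`, first to start Xvfb and then to start the browser or crawl with `env=env_for_display(...)`, so that anything the child process opens is drawn on the virtual display.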

linkdb/current/part-00000/data does not exist

2015-02-20 Thread Shuo Li
Hi, I'm trying to crawl NSF ACADIS with nutch-selenium. I hit a problem: *linkdb/current/part-00000/data does not exist*. I checked my directory and files during the crawl, and this file appears to sometimes exist and sometimes disappear, which is quite strange. Another problem

Build failed in Jenkins: Nutch-nutchgora #1345

2015-02-20 Thread Apache Jenkins Server
See https://builds.apache.org/job/Nutch-nutchgora/1345/ -- [...truncated 3224 lines...] compile: jar: deps-test: deploy: copy-generated-lib: deploy: [copy] Copying 1 file to

Re: Nutchpy crawled statistics

2015-02-20 Thread Pranshu Kumar
Hi Mohsin, Thanks for the reply. That is exactly what I was asking. Thanks for clarifying. We were also using the bin/nutch stats command, but I just wanted to be sure whether we have to add more details to the statistics. And sorry, Professor, about the out-of-context mail. Will be more specific

Nutchpy crawled statistics

2015-02-20 Thread Pranshu Kumar
I just wanted to know how we can get the crawl statistics. Is it just a matter of using the command-line options of Nutch, or do we need to write a script to generate the stats using nutchpy?

Re: Nutchpy crawled statistics

2015-02-20 Thread Mohammad Al-Mohsin
Hi Pranshu, I assume you're talking about the CS-572 (http://sunset.usc.edu/classes/cs572_2015/) class assignment at USC. I think the stats provided by bin/nutch for the crawldb are sufficient (Dr. Mattmann, please correct me if I'm wrong). However, you need to write a script/program to extract the

[jira] [Updated] (NUTCH-1925) Upgrade Tika to version 1.7

2015-02-20 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Palsulich updated NUTCH-1925: Attachment: NUTCH-1925.palsulich.p2.v2.patch Patch for 2.x which removes the Nutch version of