Sure. I will do it once I confirm it works...
On Thursday, February 12, 2015, Mattmann, Chris A (3980) <
chris.a.mattm...@jpl.nasa.gov> wrote:
> This is great, Jiaxin, can you please make a wiki page on the Nutch
> wiki that has this information?
>
> ++
This is great, Jiaxin, can you please make a wiki page on the Nutch
wiki that has this information?
++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pas
Hi Li, Shuo. You are so right. I finished installing and successfully ran
Nutch with Selenium and Firefox. I have a question though: does your
Firefox pop up for all the URLs we crawled?
Hi Prof. Mattmann. I think this is the way to install Selenium on a Mac with
OS X later than 10.6. I th
[
https://issues.apache.org/jira/browse/NUTCH-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319277#comment-14319277
]
Leo Ye commented on NUTCH-1939:
---
Good to see we fixed it. Thank you, [~wastl-nagel]
> Fetch
Hi,
Please send a message to dev-subscr...@nutch.apache.org to subscribe to the
list.
Tyler
On Feb 12, 2015 6:54 PM, "Poojan Jhaveri" wrote:
>
>
Cool. Issue resolved now.
Thanks Sebastian !
On Wed, Feb 11, 2015 at 12:21 PM, Sebastian Nagel <
wastl.na...@googlemail.com> wrote:
> Hi,
>
> the jetty-client-6.1.22.jar
> is a dependency needed only for testing.
> Consequently, it's placed in
> build/test/lib/
> but only if you run the tests,
[
https://issues.apache.org/jira/browse/NUTCH-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319064#comment-14319064
]
Markus Jelsma commented on NUTCH-1925:
--
Yes, I'll check it in tomorrow. Any comments on
I think I have possibly finished installing.
What you need to do:
0. Run git status, then git checkout any files you have modified.
1. patch -p0 < YOUR_PATCH_FILE
2. ant clean jar
3. ant runtime
Will try crawling using selenium later on. Hope this helped. >_<
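
The steps above can be sketched as a small shell function. This is only a sketch, assuming it is run from the root of a Nutch source checkout with git, patch, and ant on the PATH. The function name apply_patch_and_build is hypothetical, and YOUR_PATCH_FILE stands for whatever patch you downloaded. Reverting local modifications (step 0) is left as a manual step, since which files to check out depends on your working copy.

```shell
#!/bin/sh
# apply_patch_and_build PATCH_FILE
# Hypothetical helper mirroring steps 0-3; run it from the root of the Nutch checkout.
apply_patch_and_build() {
  patch_file="$1"

  # Stop early if the patch file does not exist.
  if [ ! -f "$patch_file" ]; then
    echo "patch file not found: $patch_file" >&2
    return 1
  fi

  # Step 0: show local modifications so they can be reverted by hand if needed.
  git status --short

  # Step 1: apply the patch relative to the current directory.
  patch -p0 < "$patch_file" || return 1

  # Step 2: clean and rebuild the jar.
  ant clean jar || return 1

  # Step 3: rebuild the runtime directory.
  ant runtime
}
```

Usage would be something like: apply_patch_and_build YOUR_PATCH_FILE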
On Thu, Feb 12, 2015 at 9:20 AM, Mattmann, Chris A
Hi,
For part one, is your depth parameter the same when you re-crawl?
Part 2: To get an idea of the fetched and unfetched URLs, Nutch provides
a tool to generate stats for the crawl. You can check the stats after
each crawl and identify which URLs are being fetched and which are not.
Regards,
Avi Sana
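
The message above does not name the stats tool. In Nutch 1.x the CrawlDb reader (bin/nutch readdb) has a -stats option that prints counts per status, such as db_fetched and db_unfetched, so it is probably the tool meant here. A minimal sketch, assuming it is run from the Nutch runtime directory. The function name crawl_stats and the NUTCH_BIN variable are hypothetical, and the crawldb path depends on how you ran the crawl.

```shell
#!/bin/sh
# crawl_stats CRAWLDB_DIR
# Hypothetical wrapper: prints CrawlDb statistics for the given crawldb directory.
# NUTCH_BIN can point at the nutch script; it defaults to bin/nutch.
crawl_stats() {
  crawldb="$1"
  nutch_bin="${NUTCH_BIN:-bin/nutch}"

  # Stop early if the crawldb directory does not exist.
  if [ ! -d "$crawldb" ]; then
    echo "crawldb not found: $crawldb" >&2
    return 1
  fi

  # Print aggregate statistics (URL counts per status) for the crawldb.
  "$nutch_bin" readdb "$crawldb" -stats
}
```

Usage would be something like: crawl_stats crawl/crawldb (the path is an assumption). To see the status of individual URLs rather than totals, readdb also accepts -dump with an output directory.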
Hi Everyone,
I started to use Nutch 1.10 for my homework and I see that every time I
perform a crawl using the same configuration and same seed urls I get a
different number of fetched urls. This occurs even when the old crawl data
is deleted.
This way I would not be able to identify which URLs h
[
https://issues.apache.org/jira/browse/NUTCH-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318651#comment-14318651
]
Lewis John McGibbney commented on NUTCH-1925:
-
+1 [~markus17] please commit if
Yes, I believe you need to install X11. Why don't you try it and report back
what you find? Thanks.
Sent from my iPhone
On Feb 12, 2015, at 8:28 AM, Jiaxin Ye
<jiaxi...@usc.edu> wrote:
Hi professor, but can we use Selenium on Mac?
On Thursday, February 12, 2015, Mattmann, Chris A (3980)
m
Hi professor, but can we use Selenium on Mac?
On Thursday, February 12, 2015, Mattmann, Chris A (3980) <
chris.a.mattm...@jpl.nasa.gov> wrote:
> You need Selenium Jiaxin, in order to crawl dynamic pages in the
> polar dataset you have been assigned in my CSCI 572 search engines class.
>
> The ins
[
https://issues.apache.org/jira/browse/NUTCH-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1925:
-
Attachment: NUTCH-1925-2x.patch
Patch for 2.x, it seems to be working. Please confirm.
> Upgrade
[
https://issues.apache.org/jira/browse/NUTCH-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318308#comment-14318308
]
Chris A. Mattmann commented on NUTCH-1942:
--
Julien can you tell me more about cra
Julien Nioche created NUTCH-1942:
Summary: Remove TopLevelDomain
Key: NUTCH-1942
URL: https://issues.apache.org/jira/browse/NUTCH-1942
Project: Nutch
Issue Type: Task
Reporter: J
You need Selenium Jiaxin, in order to crawl dynamic pages in the
polar dataset you have been assigned in my CSCI 572 search engines class.
The instructions for integrating Selenium with Nutch 1.10-trunk
are here:
https://issues.apache.org/jira/browse/NUTCH-1933
Cheers,
Chris
[
https://issues.apache.org/jira/browse/NUTCH-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318099#comment-14318099
]
Hudson commented on NUTCH-1939:
---
SUCCESS: Integrated in Nutch-trunk #2972 (See
[https://bui
[
https://issues.apache.org/jira/browse/NUTCH-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel resolved NUTCH-1939.
Resolution: Fixed
Committed to trunk, v1659227. Thanks, [~leoyey]!
> Fetcher fails to follo
[
https://issues.apache.org/jira/browse/NUTCH-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317873#comment-14317873
]
Hudson commented on NUTCH-1323:
---
SUCCESS: Integrated in Nutch-trunk #2971 (See
[https://bui
[
https://issues.apache.org/jira/browse/NUTCH-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317874#comment-14317874
]
Hudson commented on NUTCH-1913:
---
SUCCESS: Integrated in Nutch-trunk #2971 (See
[https://bui
[
https://issues.apache.org/jira/browse/NUTCH-1724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1724:
-
Attachment: NUTCH-1724-trunk.patch
Modified to adhere to Lewis' changes. Will commit shortly unles
[
https://issues.apache.org/jira/browse/NUTCH-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317829#comment-14317829
]
Markus Jelsma commented on NUTCH-1730:
--
Anything to add to this modification?
> Sco
[
https://issues.apache.org/jira/browse/NUTCH-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317826#comment-14317826
]
Markus Jelsma commented on NUTCH-1921:
--
Anything to add to these optional settings?
>
[
https://issues.apache.org/jira/browse/NUTCH-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317828#comment-14317828
]
Markus Jelsma commented on NUTCH-1684:
--
Anything to add to this? I think this can go
Well, good choice. I am thinking of changing to Ubuntu now. The thing is, why
do we need Selenium anyway? Just to make crawling easier?
On Thu, Feb 12, 2015 at 12:25 AM, Shuo Li wrote:
> Interestingly, I'm a Mac user but I don't want to screw up my laptop, so I'm
> using Vagrant with Ubuntu Trusty. I
[
https://issues.apache.org/jira/browse/NUTCH-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-1913.
--
Resolution: Fixed
> LinkDB to implement db.ignore.external.links
> -
[
https://issues.apache.org/jira/browse/NUTCH-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317816#comment-14317816
]
Markus Jelsma commented on NUTCH-1913:
--
Thanks Sebastian, committed to trunk in revis
[
https://issues.apache.org/jira/browse/NUTCH-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317815#comment-14317815
]
Markus Jelsma commented on NUTCH-1925:
--
Committed to trunk in revision 1659168.
>
[
https://issues.apache.org/jira/browse/NUTCH-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-1323.
--
Resolution: Fixed
Just in time for 1.10, Committed to trunk in revision 1659167.
> AjaxNormali
[
https://issues.apache.org/jira/browse/NUTCH-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1323:
-
Fix Version/s: (was: 1.11)
1.10
> AjaxNormalizer
> --
>
>
Interestingly, I'm a Mac user but I don't want to screw up my laptop, so I'm
using Vagrant with Ubuntu Trusty. It doesn't have a GUI, but Xvfb can still be
installed properly. The issue is that I don't know how to integrate
Selenium with Nutch 1.10.
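
For the Xvfb part only, a rough sketch of starting a virtual display on a headless Ubuntu box so that a browser such as Firefox has an X display to attach to. It does not cover the Nutch integration itself. The function name start_xvfb, the display number :99, and the screen geometry are arbitrary assumptions.

```shell
#!/bin/sh
# start_xvfb [DISPLAY]
# Hypothetical helper: starts Xvfb in the background and exports DISPLAY.
start_xvfb() {
  display="${1:-:99}"

  # Xvfb must already be installed (on Ubuntu: sudo apt-get install xvfb).
  if ! command -v Xvfb >/dev/null 2>&1; then
    echo "Xvfb is not installed" >&2
    return 1
  fi

  # Start a virtual framebuffer with one 1280x1024 screen at 24-bit depth.
  Xvfb "$display" -screen 0 1280x1024x24 &

  # Programs started from this shell will now use the virtual display.
  export DISPLAY="$display"
}
```

Usage would be something like: start_xvfb :99, followed by launching the crawl from the same shell.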
On Thu, Feb 12, 2015 at 12:04 AM, Jiaxin Ye wrote:
>
Hi all,
Does anyone here know where to find a setup tutorial for Selenium on Mac? I
find it difficult to install Xvfb on Mac.
Best,
Jiaxin
On Tue, Feb 10, 2015 at 9:42 PM, Sapnashri Suresh wrote:
> Hi Shuo Li,
>
> We were facing a similar issue. Prof. Mattmann suggested we look into this
> patc