d_k:

I think I had a space when I tried, and that could be the reason why it didn't
work! Nutch 2.2.1 builds successfully now. Thank you!!
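
For reference, the form without the space might look like the sketch below.
The helper name ivy_offline_cmd and the cache path are placeholders for
illustration only; "runtime" is the ant target mentioned earlier in this
thread, and ivy.cache.dir is the property d_k suggested. It only prints the
command line, so nothing is built by running it.

```shell
# Hypothetical helper: print the ant command for a given Ivy cache path.
# It only echoes the command; it does not invoke ant.
ivy_offline_cmd() {
  cache_dir="$1"
  # Per d_k's reply: -D and the property name are written as one argument,
  # with no space between them.
  printf 'ant -Divy.cache.dir=%s runtime\n' "$cache_dir"
}

# Placeholder path; substitute wherever the copied cache was extracted.
ivy_offline_cmd /path/to/extracted/cache
```

Extracting the cache to the default ~/.ivy2 location, as I ended up doing,
avoids the extra parameter altogether.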

Now I have run into a new issue: I tried to run my first crawl on the target
server, where the firewall prevents access to the internet, and I started
getting timeout errors. It looks like the firewall is blocking my Nutch
crawler from crawling any site. Please suggest what can be done. For ant to
compile, I could scp the .ivy2 folder to the target server and compile there.
I am not at all sure how to get Nutch to crawl websites in such a
firewall-restricted environment.

Thanks!


On Tue, Feb 11, 2014 at 2:30 PM, d_k <[email protected]> wrote:

> There shouldn't be a space between -D and ivy.cache.dir, that is
> "-Divy.cache.dir=..." and not "-D ivy.cache.dir=..." and I trust you
> changed "/path/to/extraced/cache" to the correct path?
>
>
> On Tue, Feb 11, 2014 at 7:54 PM, A Laxmi <[email protected]> wrote:
>
> > d_k & Tejas:
> >
> > Yay!! It worked!! I had to extract the uploaded ivy2 folder to
> /root/.ivy2
> > and had to use "ant runtime". It ran very well and BUILD was Successful!
> >
> > On a side note, I initially tried to put ivy2 folder in a different path
> > and used this parameter "-D ivy.cache.dir=/path/to/extraced/cache". which
> > didn't work, not sure why.
> >
> > Thanks so much!!!
> >
> >
> >
> >
> > On Tue, Feb 11, 2014 at 11:52 AM, d_k <[email protected]> wrote:
> >
> > > This should work as is. Copy them to the target server and try to
> > compile.
> > >
> > >
> > > On Tue, Feb 11, 2014 at 6:18 PM, A Laxmi <[email protected]>
> wrote:
> > >
> > > > Hi,
> > > >
> > > > I compiled nutch on a linux server connected to the internet, like Tejas
> > > > suggested, and found the .ivy2 folder. However, there are some files in
> > > > that folder whose filenames include the server's own hostname. I am
> > > > wondering how I can scp this .ivy2 folder to the other server, which has
> > > > a different hostname? Can I just manually edit those filenames to match
> > > > the other server's hostname? Please advise
> > > >
> > > >
> > > > On Sat, Feb 8, 2014 at 9:32 AM, d_k <[email protected]> wrote:
> > > >
> > > > > Tejas Patil is right, you should copy over the .ivy2 folder and it
> > will
> > > > > work.
> > > > >
> > > > > You can extract it to some other location and run ant with the
> > > parameter
> > > > > "-D
> > > > > ivy.cache.dir=/path/to/extraced/cache".
> > > > >
> > > > > In order to use the eclipse project behind a firewall you can
> either
> > > run
> > > > > 'ant eclipse' and copy over the .project and .classpath files or
> > > download
> > > > > the ant-eclipse-1.0.bin.tar.bz2 file, the default url is [0] and
> then
> > > > > either edit the ant-eclipse-download target in build.xml to a web
> > > server
> > > > > serving the copied tar over http or change the build.xml
> > > > > ant-eclipse-download target from a get task to something along the
> > > lines
> > > > > of:
> > > > >
> > > > > <copy file="/path/to/local/ant-eclipse-1.0.bin.tar.bz2"
> > > > > todir="${build.dir}" />
> > > > >
> > > > > [0]
> > > > >
> > > > >
> > > >
> > >
> >
> http://downloads.sourceforge.net/project/ant-eclipse/ant-eclipse/1.0/ant-eclipse-1.0.bin.tar.bz2
> > > > >
> > > > >
> > > > > On Sat, Feb 8, 2014 at 11:28 AM, Tejas Patil <
> > [email protected]
> > > > > >wrote:
> > > > >
> > > > > > > This has more to do with ant than with nutch. Here is a wild idea:
> > > > > > >
> > > > > > > Grab a linux box without any internet restrictions, download nutch
> > > > > > > onto it and build it. In the user home, there would be a hidden
> > > > > > > directory ".ivy2" which is a local ivy cache. Create a tarball of it
> > > > > > > and scp it over to your work machine, extract it in the home
> > > > > > > directory and then run the nutch build.
> > > > > >
> > > > > > PS: I have never done this for ivy but for maven and it had
> worked.
> > > > > >
> > > > > > ~tejas
> > > > > >
> > > > > >
> > > > > > On Fri, Feb 7, 2014 at 2:18 PM, A Laxmi <[email protected]>
> > > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > I am having issues building Nutch 2.2.1 behind my company
> > firewall.
> > > > My
> > > > > > > build gets stuck here:
> > > > > > >
> > > > > > > [ivy:resolve] :: loading settings :: file =
> > > > > > > ~/nutchtest/nutch/ivy/ivysettings.xml
> > > > > > >
> > > > > > > When I contacted the hosting admin, they said - "Ant is trying
> to
> > > > > > download
> > > > > > > files from internet and it will have problems with our
> firewalls.
> > > You
> > > > > > will
> > > > > > > either have to download the files yourself and then scp/sftp
> them
> > > to
> > > > > the
> > > > > > > machine. Unfortunately we don't have an http proxy."
> > > > > > >
> > > > > > >
> > > > > > > From further digging, I could see Ant is trying to access this
> > link
> > > > > > > http://ant.apache.org/ivy/. Could anyone please advise what I
> > > should
> > > > > do
> > > > > > to
> > > > > > > make Ant compile Nutch without accessing the internet? I can
> > > download
> > > > > > > required files from http://ant.apache.org/ivy/ and scp/sftp to
> > the
> > > > > > server
> > > > > > > but I am not sure what files to download and where to put them?
> > > > > > >
> > > > > > > Thanks for your help!!
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
