Re: Newbie trouble - Hbase class not found

2016-05-16 Thread Lewis John Mcgibbney
Hi Diego,

The PR at https://github.com/apache/nutch/pull/111 will solve your issue.
Thanks

On Mon, May 16, 2016 at 11:40 AM, <user-digest-h...@nutch.apache.org> wrote:

>
> From: diego gullo <diegogu...@gmail.com>
> To: user@nutch.apache.org
> Cc:
> Date: Sun, 15 May 2016 20:04:05 +0100
> Subject: Re: Newbie trouble - Hbase class not found
> Hi Lewis
>
> I have changed the build for the Docker containers and, over the weekend, sent
> the PR for the logs folder. The original problem I had still persists.
>
> To reproduce
>
>
>    1. Check out https://github.com/bizmate/nutch
>    2. Run *docker-compose up -d*. This pulls the Docker image based on the
>    official Dockerfile and mounts it with the configuration suggested in
>    the documentation on the Nutch site, i.e. the Ivy, Gora and
>    nutch-site configs, all available at
>    https://github.com/bizmate/nutch/tree/master/docker/. This includes the
>    suggestion from your previous email:
>    https://github.com/bizmate/nutch/blob/master/docker/nutch/ivy.xml#L117
>    3. Access the container: docker exec -it nutch bash
>    4. su hdbase
>    5. Run inject; it still reports class not found
>
>
> hduser@458c70ec85a2:/opt/nutch$ bin/nutch inject urls/seed.txt
> InjectorJob: starting at 2016-05-15 18:40:31
> InjectorJob: Injecting urlDir: urls/seed.txt
> Exception in thread "main" *java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration*
> at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:114)
> at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
> at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
> at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
> at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:78)
> at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:233)
> at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:267)
> at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:290)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:299)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> ... 10 more
>
> This happens despite setting HBASE_HOME and HADOOP_CLASSPATH as suggested
> here:
>
> http://stackoverflow.com/questions/26364057/exception-in-thread-main-java-lang-noclassdeffounderror-org-apache-hadoop-hba
>
>


Re: Newbie trouble - Hbase class not found

2016-05-15 Thread diego gullo
Hi Lewis

I have changed the build for the Docker containers and, over the weekend, sent
the PR for the logs folder. The original problem I had still persists.

To reproduce


   1. Check out https://github.com/bizmate/nutch
   2. Run *docker-compose up -d*. This pulls the Docker image based on the
   official Dockerfile and mounts it with the configuration suggested in
   the documentation on the Nutch site, i.e. the Ivy, Gora and
   nutch-site configs, all available at
   https://github.com/bizmate/nutch/tree/master/docker/. This includes the
   suggestion from your previous email:
   https://github.com/bizmate/nutch/blob/master/docker/nutch/ivy.xml#L117
   3. Access the container: docker exec -it nutch bash
   4. su hdbase
   5. Run inject; it still reports class not found


hduser@458c70ec85a2:/opt/nutch$ bin/nutch inject urls/seed.txt
InjectorJob: starting at 2016-05-15 18:40:31
InjectorJob: Injecting urlDir: urls/seed.txt
Exception in thread "main" *java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration*
at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:114)
at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:78)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:233)
at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:267)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:290)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:299)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 10 more

This happens despite setting HBASE_HOME and HADOOP_CLASSPATH as suggested
here:
http://stackoverflow.com/questions/26364057/exception-in-thread-main-java-lang-noclassdeffounderror-org-apache-hadoop-hba
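
A quick way to narrow down an error like this is to ask the JVM which of the two classes it can actually see. In the trace above, HBaseStore.initialize appears in the stack, so the gora-hbase jar was found; the class that cannot be loaded is HBaseConfiguration, which suggests the HBase jars themselves are not on the classpath. The following is only a minimal sketch and not part of Nutch: ClasspathProbe and its methods are hypothetical names, and it is only meaningful if it is run with the same classpath that bin/nutch uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Nutch or Gora): reports whether given
// classes are visible on the classpath this program was started with.
public class ClasspathProbe {

    /** Returns true if the named class can be located by this class's loader. */
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Probes each name in order and records whether it was found. */
    public static Map<String, Boolean> probe(String... classNames) {
        Map<String, Boolean> result = new LinkedHashMap<String, Boolean>();
        for (String name : classNames) {
            result.put(name, isLoadable(name));
        }
        return result;
    }

    public static void main(String[] args) {
        // The two class names are taken from the errors reported in this thread.
        Map<String, Boolean> result = probe(
                "org.apache.gora.hbase.store.HBaseStore",
                "org.apache.hadoop.hbase.HBaseConfiguration");
        for (Map.Entry<String, Boolean> e : result.entrySet()) {
            System.out.println((e.getValue() ? "found    " : "MISSING  ") + e.getKey());
        }
        // Show what the JVM was actually given, to compare against HADOOP_CLASSPATH.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If the jars sit in a lib/ directory next to bin/ (an assumption about the container layout), something like `java -cp "lib/*:." ClasspathProbe` would run the check against them. A MISSING line for HBaseConfiguration alongside a found line for HBaseStore would match the trace above.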




Re: Newbie trouble - Hbase class not found

2016-05-10 Thread diego gullo
Hi Lewis

Thanks a lot for the reply. Regarding the Ivy config, I don't think this is
the problem.

I have the HBase config enabled here:

https://github.com/bizmate/nutch/blob/master/docker/nutch/ivy.xml#L115

This file is mounted into the image through Docker Compose:

https://github.com/bizmate/nutch/blob/master/docker-compose.yml#L14

The difference from the file you pointed to is the revision. I am unsure
whether it makes a difference.

However, you suggested using some specific Dockerfiles. I am not 100% sure
whether these have already been used to publish images on Docker Hub. In fact,
I would not mind setting this up, although if the images are already pushed to
Docker Hub I would rather not spend time on something that is already
there.

These are all the Nutch images I see already available on the hub:

https://hub.docker.com/search/?isAutomated=0=0=1=0=nutch=0

If the official image is not there, I would be happy to contribute, ask
Docker for the 'nutch' namespace and set up the automated build.

If interested, please let me know. I can use IRC if preferred.



On 9 May 2016 at 16:12, Lewis John Mcgibbney 
wrote:

> Hi Diego,
>
> On Mon, May 9, 2016 at 2:32 AM,  wrote:
>
> >
> > From: diego gullo 
> > To: user@nutch.apache.org
> > Cc:
> > Date: Sat, 7 May 2016 09:41:00 +0100
> > Subject: Newbie trouble - Hbase class not found
> > I am trying Nutch for the first time. I created an automated Docker setup
> > to load Nutch 2 + HBase (I had tried Cassandra but could not get it to
> > work, so I thought I would start with HBase to give it a try).
> >
>
> I would suggest using the official Nutch containers, which can be found at
> https://github.com/apache/nutch/tree/2.x/docker/hbase
>
>
> >
> > The project is available at https://github.com/bizmate/nutch
> >
>
> Nice, thanks for posting
>
>
> > and with Docker Compose you can start the containers with a running
> > instance of Nutch exposed on port 8899, plus HBase.
> >
>
> Cool.
>
>
> >
> > In gora.properties I have already enabled HBase:
> >
> > gora.datastore.default=org.apache.gora.hbase.store.HBaseStore
> >
> >
> > But I get an HBase class-not-found error when I run this command:
> >
> > root@87b87f55835e:/opt/nutch# bin/nutch inject urls.txt
> > InjectorJob: starting at 2016-05-07 08:37:49
> > InjectorJob: Injecting urlDir: urls.txt
> > *InjectorJob: java.lang.ClassNotFoundException: org.apache.gora.hbase.store.HBaseStore*
> >
> >
> [snip]
>
>
> >
> > Suggestions?
> >
> >
> Yes, you have not enabled the gora-hbase dependency download in
> ivy/ivy.xml:
> https://github.com/apache/nutch/blob/2.x/ivy/ivy.xml#L114-L117
>
> Please refer to the tutorial for further advice
> http://wiki.apache.org/nutch/Nutch2Tutorial
> Thanks
>
> --
> *Lewis*
>



-- 
www.bizmate.biz


Re: Newbie trouble - Hbase class not found

2016-05-09 Thread Lewis John Mcgibbney
Hi Diego,

On Mon, May 9, 2016 at 2:32 AM,  wrote:

>
> From: diego gullo 
> To: user@nutch.apache.org
> Cc:
> Date: Sat, 7 May 2016 09:41:00 +0100
> Subject: Newbie trouble - Hbase class not found
> I am trying Nutch for the first time. I created an automated Docker setup
> to load Nutch 2 + HBase (I had tried Cassandra but could not get it to
> work, so I thought I would start with HBase to give it a try).
>

I would suggest using the official Nutch containers, which can be found at
https://github.com/apache/nutch/tree/2.x/docker/hbase


>
> The project is available at https://github.com/bizmate/nutch
>

Nice, thanks for posting


> and with Docker Compose you can start the containers with a running
> instance of Nutch exposed on port 8899, plus HBase.
>

Cool.


>
> In gora.properties I have already enabled HBase:
>
> gora.datastore.default=org.apache.gora.hbase.store.HBaseStore
>
>
> But I get an HBase class-not-found error when I run this command:
>
> root@87b87f55835e:/opt/nutch# bin/nutch inject urls.txt
> InjectorJob: starting at 2016-05-07 08:37:49
> InjectorJob: Injecting urlDir: urls.txt
> *InjectorJob: java.lang.ClassNotFoundException: org.apache.gora.hbase.store.HBaseStore*
>
>
[snip]


>
> Suggestions?
>
>
Yes, you have not enabled the gora-hbase dependency download in
ivy/ivy.xml:
https://github.com/apache/nutch/blob/2.x/ivy/ivy.xml#L114-L117

Please refer to the tutorial for further advice
http://wiki.apache.org/nutch/Nutch2Tutorial
Thanks

-- 
*Lewis*
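
For context on the error quoted in this message: the class in the ClassNotFoundException is exactly the one named by gora.datastore.default, so the property is being read but the jar that provides the class is not on the classpath. As a rough sketch under that reading, the snippet below pulls the configured store class name out of gora.properties text using only java.util.Properties. GoraStoreCheck and configuredStore are hypothetical names, not Gora or Nutch APIs.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper (not a Gora API): extracts the data store class name
// from the text of a gora.properties file.
public class GoraStoreCheck {

    public static final String KEY = "gora.datastore.default";

    /** Returns the trimmed value of gora.datastore.default, or null if unset. */
    public static String configuredStore(String goraPropertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(goraPropertiesText));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail.
            throw new IllegalStateException(e);
        }
        String value = props.getProperty(KEY);
        return value == null ? null : value.trim();
    }

    public static void main(String[] args) {
        // Sample line copied from the message above.
        String sample = "gora.datastore.default=org.apache.gora.hbase.store.HBaseStore\n";
        System.out.println("configured store: " + configuredStore(sample));
    }
}
```

The returned name can then be checked against the jars that Ivy actually downloaded; if no gora-hbase jar is present, the ClassNotFoundException above is the expected result.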