Spark 1.1.0 hbase_inputformat.py not work

2014-09-23 Thread Gilberto Lira
Hi,

I'm trying to run the hbase_inputformat.py example, but I can't get it to work.

This is the error:

Traceback (most recent call last):
  File "/root/spark/examples/src/main/python/hbase_inputformat.py", line 70, in <module>
    conf=conf)
  File "/root/spark/python/pyspark/context.py", line 471, in newAPIHadoopRDD
    jconf, batchSize)
  File "/root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.io.ImmutableBytesWritable
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:150)
    at org.apache.spark.api.python.PythonRDD$.newAPIHadoopRDDFromClassNames(PythonRDD.scala:451)
    at org.apache.spark.api.python.PythonRDD$.newAPIHadoopRDD(PythonRDD.scala:436)
    at org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD(PythonRDD.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:259)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)

Can anyone help me?


Re: Spark 1.1.0 hbase_inputformat.py not work

2014-09-23 Thread freedafeng
I don't know if it's relevant, but I had to compile Spark for my specific
HBase and Hadoop versions to make hbase_inputformat.py work.



Re: Spark 1.1.0 hbase_inputformat.py not work

2014-09-30 Thread Kan Zhang
I somehow missed this. Do you still have the problem? You probably didn't
specify the correct spark-examples jar using --driver-class-path. See the
following example.

MASTER=local ./bin/spark-submit \
  --driver-class-path ./examples/target/scala-2.10/spark-examples-1.1.0-SNAPSHOT-hadoop1.0.4.jar \
  ./examples/src/main/python/hbase_inputformat.py localhost test
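
If you want to confirm that the jar you pass really contains the class named in the ClassNotFoundException, a quick check along these lines may help. It is only a sketch: the function name jar_contains_class is made up for illustration, and it relies on nothing more than a jar being a zip archive with one .class entry per class.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar has a .class entry for the given class name.

    jar_contains_class is a hypothetical helper, not part of Spark or HBase.
    A jar is a zip archive, so a class such as a.b.C is stored as a/b/C.class.
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Example call (the jar path is an assumption; use the one from your own build):
#   jar_contains_class(
#       "./examples/target/scala-2.10/spark-examples-1.1.0-SNAPSHOT-hadoop1.0.4.jar",
#       "org.apache.hadoop.hbase.io.ImmutableBytesWritable")
```

If the check returns False for the jar on your --driver-class-path, the driver has no way to load the class, which would match the error above.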



Re: Spark 1.1.0 hbase_inputformat.py not work

2014-10-01 Thread Kan Zhang
CC user@ for indexing.

Glad you fixed it. All source code for these examples is under
SPARK_HOME/examples. For example, the converters used here are in
examples/src/main/scala/org/apache/spark/examples/pythonconverters/HBaseConverters.scala
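
To show how the Python side ties into those converters, here is a rough sketch of the arguments the example hands to newAPIHadoopRDD. The helper name hbase_read_args is invented for illustration, and the converter class names and configuration keys are written from memory of the 1.1 example, so please check them against your copy of hbase_inputformat.py and HBaseConverters.scala before relying on them.

```python
def hbase_read_args(host, table):
    """Build keyword arguments for SparkContext.newAPIHadoopRDD.

    hbase_read_args is a hypothetical helper. The class names and conf keys
    below are assumptions to verify against the example sources.
    """
    converters = "org.apache.spark.examples.pythonconverters."
    return {
        "inputFormatClass": "org.apache.hadoop.hbase.mapreduce.TableInputFormat",
        "keyClass": "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
        "valueClass": "org.apache.hadoop.hbase.client.Result",
        # The converters are Scala classes shipped in the spark-examples jar.
        "keyConverter": converters + "ImmutableBytesWritableToStringConverter",
        "valueConverter": converters + "HBaseResultToStringConverter",
        "conf": {
            "hbase.zookeeper.quorum": host,
            "hbase.mapreduce.inputtable": table,
        },
    }


# With a live SparkContext `sc` and the examples jar on the driver class path:
#   hbase_rdd = sc.newAPIHadoopRDD(**hbase_read_args("localhost", "test"))
```

Every name passed here is resolved on the JVM side, which is why both the HBase classes and the converters have to be reachable from the driver class path.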

Btw, you may find our blog post useful.
https://databricks.com/blog/2014/09/17/spark-1-1-bringing-hadoop-inputoutput-formats-to-pyspark.html

On Wed, Oct 1, 2014 at 6:54 AM, Gilberto Lira wrote:

> Exactly, Kan, this was the problem!
>
> By the way, I have not found the source code of these examples. Do you know
> where I can find it?
>
> Thanks
>


Re: Spark 1.1.0 hbase_inputformat.py not work

2014-10-01 Thread Gilberto Lira
Thank you Zhang!

I am grateful for your help!
