Thanks
>
> 2014-10-01 1:37 GMT-03:00 Kan Zhang :
I somehow missed this. Do you still have the problem? You probably didn't
specify the correct spark-examples jar using --driver-class-path. See the
following for an example.
MASTER=local ./bin/spark-submit --driver-class-path
./examples/target/scala-2.10/spark-examples-1.1.0-SNAPSHOT-hadoop1.0.4.jar
> java.lang.IncompatibleClassChangeError: Found interface
org.apache.hadoop.mapreduce.JobContext, but class was expected
Most likely this is the Hadoop 1 vs. Hadoop 2 issue. The example was given
for Hadoop 1 (the default Hadoop version for Spark). You may try to set the
output format class in conf for
possible to use only
>> cassandra - input/output without hadoop?
>> 3) I know there are a couple of strategies for storage level. In case my
>> data set is quite big and I don't have enough memory to process it, can I
>> use the DISK_ONLY option without Hadoop (having only Cassandra)?
In Spark 1.1, it is possible to read from Cassandra using Hadoop jobs. See
examples/src/main/python/cassandra_inputformat.py for an example. You may
need to write your own key/value converters.
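For illustration, here is a sketch of what such a job can look like from
PySpark. The class names follow the Spark 1.1 cassandra_inputformat.py
example, and the port/partitioner/page-size values are assumptions you would
adjust to your cluster:

```python
def build_cassandra_conf(host, keyspace, column_family):
    """Hadoop job conf for Cassandra input, modeled on the Spark 1.1
    cassandra_inputformat.py example. The port, partitioner, and page
    size below are assumptions; adjust them to your cluster."""
    return {
        "cassandra.input.thrift.address": host,
        "cassandra.input.thrift.port": "9160",
        "cassandra.input.keyspace": keyspace,
        "cassandra.input.columnfamily": column_family,
        "cassandra.input.partitioner.class": "Murmur3Partitioner",
        "cassandra.input.page.row.size": "3",
    }


def cassandra_rdd(sc, host, keyspace, column_family):
    """Read a Cassandra column family as an RDD of key/value dicts.

    Sketch only: requires the spark-examples jar (which bundles the
    CassandraCQL*Converter classes) on the classpath, e.g. via
    --driver-class-path as shown earlier in the thread."""
    return sc.newAPIHadoopRDD(
        "org.apache.cassandra.hadoop.cql3.CqlPagingInputFormat",
        "java.util.Map",
        "java.util.Map",
        keyConverter="org.apache.spark.examples.pythonconverters."
                     "CassandraCQLKeyConverter",
        valueConverter="org.apache.spark.examples.pythonconverters."
                       "CassandraCQLValueConverter",
        conf=build_cassandra_conf(host, keyspace, column_family))
```

If your tables need different key/value shapes, this is where a custom
converter pair would be plugged in instead.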
On Tue, Sep 2, 2014 at 11:10 AM, Oleg Ruchovets
wrote:
> Hi All,
> Is it possible to have cassand
Good timing! I encountered that same issue recently and to address it, I
changed the default Class.forName call to Utils.classForName. See my patch
at https://github.com/apache/spark/pull/1916. After that change, my
bin/pyspark --jars worked.
On Wed, Aug 13, 2014 at 11:47 PM, Tassilo Klein wrote:
Tassilo, newAPIHadoopRDD has been added to PySpark in master and the
yet-to-be-released 1.1 branch. It allows you to specify your custom
InputFormat. Examples of using it include hbase_inputformat.py and
cassandra_inputformat.py in examples/src/main/python. Check them out.
On Wed, Aug 13, 2014 at 3:12 PM,
Andrew, there are overloaded versions of saveAsHadoopFile or
saveAsNewAPIHadoopFile that allow you to pass in a per-job Hadoop conf.
saveAsTextFile is just a convenience wrapper on top of saveAsHadoopFile.
On Mon, Jul 14, 2014 at 11:22 PM, Andrew Ash wrote:
> In general it would be nice to be a
I couldn't reproduce your issue locally, but I suspect it has something to
do with partitioning. zip() does it by partition and it assumes the two
RDDs have the same number of partitions and the same number of elements in
each partition. By default, map() doesn't preserve partitioning. Try setting
pres
Yes, it can if you set the output format to SequenceFileOutputFormat. The
difference is saveAsSequenceFile does the conversion to Writable for you if
needed and then calls saveAsHadoopFile.
On Fri, Jun 20, 2014 at 12:43 AM, abhiguruvayya
wrote:
> Does JavaPairRDD.saveAsHadoopFile store data as
Can you use saveAsObjectFile?
On Thu, Jun 19, 2014 at 5:54 PM, abhiguruvayya
wrote:
> I want to store a JavaRDD as a sequence file instead of a text file, but I
> don't see any Java API for that. Is there a way to do this? Please let me
> know. Thanks!