I wanted to ask a general question about Hadoop/YARN and Apache Spark integration.
I know that Hadoop on a physical cluster has rack awareness, i.e. it attempts to
minimise network traffic by saving replicated blocks within a rack.
I wondered whether, when Spark is configured to use
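For background on the rack-awareness side of the question: HDFS learns the cluster topology from an admin-supplied script, and the block placement policy then keeps one replica off-rack while co-locating the others. A minimal sketch of the usual wiring in core-site.xml (the script path is an assumption, not from the original message):

```xml
<!-- core-site.xml: point Hadoop at a topology script (path is illustrative) -->
<property>
  <name>net.topology.script.file.name</name>
  <value>/etc/hadoop/conf/topology.sh</value>
</property>
```

The script receives datanode IPs/hostnames as arguments and prints a rack path such as `/rack1` for each; with the default placement policy HDFS then writes one replica locally, one on a different rack, and one on another node of that second rack.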
Hi
I have a five-node CDH 5.3 cluster running on CentOS 6.5, and I also have a
separate install of Spark 1.3.1. ( The CDH 5.3 install ships Spark 1.2, but I
wanted a newer version. )
I managed to write some Scala-based code using a HiveContext to connect to Hive
and create/populate tables etc.
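For readers following along, the HiveContext pattern being described looks roughly like the sketch below, assuming Spark 1.3.1 with hive-site.xml on the classpath so the metastore can be found (table and file names are illustrative, not the original code):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object HiveExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("HiveExample"))
    // HiveContext picks up hive-site.xml from the classpath to locate the metastore
    val hiveContext = new HiveContext(sc)

    // Create and populate a table via HiveQL, then query it back
    hiveContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
    hiveContext.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")
    hiveContext.sql("SELECT key, value FROM src LIMIT 10").collect().foreach(println)

    sc.stop()
  }
}
```

Submitted with `spark-submit`, this runs against whichever metastore hive-site.xml points at.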
Hi
Is it true that if I want to use Spark SQL ( for Spark 1.3.1 ) against Apache
Hive I need to build Spark from source?
I'm using CDH 5.3 on CentOS Linux 6.5, which uses Hive 0.13.0 ( I think ).
cheers
Mike F
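For what it's worth: the pre-built Spark 1.3.1 binaries on the download page do include Hive support, but if you build from source yourself, Hive integration is opt-in via Maven profiles. A build sketch from the Spark 1.3 docs (the Hadoop profile/version shown is an assumption; match it to your CDH Hadoop version):

```
# From the Spark 1.3.1 source tree; -Phive/-Phive-thriftserver enable Hive support
build/mvn -Pyarn -Phadoop-2.4 -Phive -Phive-thriftserver -DskipTests clean package
```

You also need hive-site.xml on Spark's classpath (e.g. copied into `conf/`) so HiveContext talks to the existing CDH metastore rather than creating a local one.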
Hi
I have a question about Spark Twitter stream processing in Spark 1.3.1. The
code sample below just opens up a Twitter stream, uses auth keys, splits out
hashtags and creates a temp table. However, when I try to compile it using
sbt ( CentOS 6.5 ) I get the error
[error]
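The sbt error itself is cut off above, but a common cause of compile failures with this kind of code is that `spark-streaming-twitter` is a separate artifact that must be added to build.sbt alongside spark-core and spark-streaming. A sketch of the described program, assuming Spark 1.3.1 (OAuth keys and batch interval are placeholders, not the original values):

```scala
// build.sbt typically needs, in addition to spark-core:
//   "org.apache.spark" %% "spark-streaming"         % "1.3.1"
//   "org.apache.spark" %% "spark-streaming-twitter" % "1.3.1"
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils

object TwitterHashtags {
  def main(args: Array[String]): Unit = {
    // Twitter4J reads its OAuth credentials from system properties
    System.setProperty("twitter4j.oauth.consumerKey", "...")
    System.setProperty("twitter4j.oauth.consumerSecret", "...")
    System.setProperty("twitter4j.oauth.accessToken", "...")
    System.setProperty("twitter4j.oauth.accessTokenSecret", "...")

    val ssc = new StreamingContext(new SparkConf().setAppName("TwitterHashtags"), Seconds(10))
    val stream = TwitterUtils.createStream(ssc, None)

    // Split tweet text into words and keep only the hashtags
    val hashtags = stream.flatMap(status => status.getText.split(" ").filter(_.startsWith("#")))

    // Register each batch as a temp table for Spark SQL queries
    hashtags.foreachRDD { rdd =>
      val sqlContext = new SQLContext(rdd.sparkContext)
      import sqlContext.implicits._
      rdd.toDF("hashtag").registerTempTable("hashtags")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If the code itself compiles in isolation, the `[error]` line is worth posting in full, since unresolved `TwitterUtils` symbols versus a scalac version mismatch point to very different fixes.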
Hi
I'm getting the following error when trying to process a CSV-based data file.

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to
stage failure: Task 1 in stage 10.0 failed 4 times, most recent failure: Lost
task 1.3 in stage 10.0 (TID 262,
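The trace is cut off before the root cause, so this is only a guess, but a frequent source of "Task failed 4 times" on CSV input is a malformed or short row: indexing into the result of `line.split(",")` throws `ArrayIndexOutOfBoundsException` on any row with fewer fields, and the task dies repeatedly. A defensive parsing sketch (file path and field count are assumptions for illustration):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CsvParse {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("CsvParse"))
    val lines = sc.textFile("hdfs:///data/input.csv") // path is a placeholder
    val expectedFields = 5                            // adjust to your file

    // split with limit -1 keeps trailing empty fields; filtering on length
    // drops malformed rows instead of letting one bad line kill the task
    val rows = lines
      .map(_.split(",", -1))
      .filter(_.length == expectedFields)

    println(s"parsed ${rows.count()} well-formed rows")
    sc.stop()
  }
}
```

The full stack trace (the part after "TID 262," and the nested "Caused by:") would confirm whether it is a parsing exception or something else, such as an executor being lost.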