Hi
I have a few questions about the structure of HDFS and S3 when a Spark-like
system loads data from these two storage systems.
Generally, when Spark loads data from HDFS, HDFS supports data locality and
already holds the file distributed across the datanodes, right? So Spark can
just process the data on the workers.
What about S3?
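The difference the question is getting at can be sketched in plain Python. This is a hypothetical simulation, not Spark's actual scheduler: with HDFS, the namenode reports which datanodes hold each block, so tasks can be placed next to their data (NODE_LOCAL); with S3 there is no such mapping, so every read is a remote fetch over the network. The function and names below are illustrative assumptions.

```python
# Hypothetical sketch of locality-aware task placement (not Spark's real code).
# HDFS exposes block -> datanode locations; S3 exposes no locality at all.

def assign_tasks(blocks, workers, block_locations=None):
    """Assign one task per block. With block_locations (the HDFS case),
    prefer a worker that already holds the block; without them (the S3
    case), any worker is picked and the read goes over the network."""
    assignments = []
    for i, block in enumerate(blocks):
        if block_locations and block in block_locations:
            # HDFS reports which datanodes hold each block replica.
            worker = block_locations[block][0]
            locality = "NODE_LOCAL"
        else:
            # S3 has no locality: whichever worker runs the task must
            # fetch the object remotely.
            worker = workers[i % len(workers)]
            locality = "ANY"
        assignments.append((block, worker, locality))
    return assignments

hdfs_locs = {"blk1": ["node1"], "blk2": ["node3"]}
print(assign_tasks(["blk1", "blk2"], ["node1", "node2", "node3"], hdfs_locs))
print(assign_tasks(["blk1", "blk2"], ["node1", "node2", "node3"]))
```

In practice this is why an S3 scan is bound by network bandwidth rather than local disk throughput, even though the Spark API for reading both looks the same.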
What kind of steps exist when reading the ORC format in Spark SQL?
I mean, reading a CSV file is usually just reading the dataset directly into
memory, but I feel like Spark SQL has some extra steps when reading the ORC
format.
For example, does it have to create a table and then insert the dataset into
it?
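One concrete extra step can be sketched without Spark at all. ORC is a self-describing columnar format: rows are grouped into "stripes", and the file carries min/max statistics per column, so a reader can skip whole stripes that cannot match a predicate instead of scanning everything the way a plain CSV read does. The toy stripe layout and function below are illustrative assumptions, not ORC's real reader:

```python
# Illustrative sketch (not the real ORC reader): each stripe carries
# min/max statistics for a column, letting the reader skip stripes
# that cannot possibly satisfy the predicate.

stripes = [
    {"rows": [10, 12, 15], "min": 10, "max": 15},
    {"rows": [40, 42, 49], "min": 40, "max": 49},
    {"rows": [70, 75, 78], "min": 70, "max": 78},
]

def scan_where_greater_than(stripes, threshold):
    """Read only stripes whose max exceeds the threshold, then filter rows."""
    out = []
    for stripe in stripes:
        if stripe["max"] <= threshold:
            continue  # skip the whole stripe: no row in it can match
        out.extend(r for r in stripe["rows"] if r > threshold)
    return out

print(scan_where_greater_than(stripes, 50))  # only the last stripe is read
```

So the answer is no table creation or insert is needed just to read ORC; the extra steps are parsing the file footer for the schema and statistics, then selectively decoding stripes.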
Hi,
Simple Question about Spark Distribution of Small Dataset.
Let's say I have a cluster of 8 machines, each with 48 cores and 48 GB of RAM.
The dataset (ORC format, created by Hive) is quite small, about 1 GB, but I
copied it to HDFS.
1) If Spark SQL runs against the dataset distributed on HDFS across the
machines, what happens
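A back-of-envelope sketch of the small-dataset case, assuming a default HDFS block size of 128 MB (the actual split size depends on the input format and Spark/Hadoop configuration, so these numbers are illustrative):

```python
# Rough arithmetic for a 1 GB file on an 8-machine, 48-cores-each cluster,
# assuming 128 MB HDFS blocks (configuration-dependent assumption).
import math

file_size_mb = 1024          # ~1 GB dataset
block_size_mb = 128          # assumed HDFS block size
machines, cores_per_machine = 8, 48

splits = math.ceil(file_size_mb / block_size_mb)   # initial input splits/tasks
total_cores = machines * cores_per_machine

print(f"input splits (tasks): {splits}")                           # 8
print(f"total cores: {total_cores}")                               # 384
print(f"cores left idle during the scan: {total_cores - splits}")  # 376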
> Big Data Analytics with Spark
> <http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>
>
> *From:* Philip Lee [mailto:philjj...@gmail.com]
> *Sent:* Monday, January 25, 2016 9:51 AM
> *To:* user@spark.apache.org
> *Subject:* Re: a question about we
?
2) I am still wondering how to see the log after copying the log file to my
local machine. The error was mentioned in the previous mail.
Thanks,
Phil
On Mon, Jan 25, 2016 at 5:36 PM, Philip Lee wrote:
> Hello, a question about the web UI log.
>
> I could see web interface log after forwarding the port on m
Hello, a question about the web UI log.
I could see the web interface log after forwarding the port on my cluster to
my local machine and clicking a completed application, but when I clicked
"application detail UI"
[image: Inline image 1]
this happened to me, and I do not know why. I also checked the specific log
folder