Hi All,
We have 10 tables in our data warehouse (HDFS/Hive), written in ORC format. We
are currently serving a use case on top of them by joining 4-5 tables using
Hive, but it is not as fast as we want it to be, so we are thinking of using
Spark for this use case.
Any suggestions on this?
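If it helps, one common pattern (a sketch only; the table and column names below are made up, not from your schema) is to let Spark read the Hive ORC tables through the metastore and express the join in Spark SQL, caching the small dimension tables that are joined repeatedly:

```sql
-- Hypothetical table/column names for illustration only.
-- With enableHiveSupport(), Spark resolves these tables from the Hive metastore.
CACHE TABLE dim_customer;   -- keep a small, frequently joined table in memory

SELECT f.order_id,
       c.customer_name,
       p.product_name,
       f.amount
FROM   fact_orders  f
JOIN   dim_customer c ON f.customer_id = c.customer_id
JOIN   dim_product  p ON f.product_id  = p.product_id
WHERE  f.order_date >= '2020-01-01';
```

Whether this beats Hive depends mostly on data sizes and whether the dimension tables are small enough for Spark to broadcast them in the join.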
I don’t understand this change. Wouldn’t this “ban” confuse the hell out of
both new and old users?
For old users, their old code that was working for char(3) would now stop
working.
For new users, depending on whether the underlying metastore's char(3) is
either supported but different from ANSI
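For context, ANSI CHAR(n) semantics pad values on write and ignore trailing spaces on comparison, which is exactly where engines diverge. A minimal Python sketch of those two rules (my own illustration, not Spark code):

```python
# Sketch of ANSI CHAR(n) semantics: values are right-padded to length n
# on write, and trailing spaces are not significant in comparisons.

def char_store(value: str, n: int) -> str:
    """Right-pad to length n, as an ANSI CHAR(n) column would on write."""
    if len(value) > n:
        raise ValueError(f"value longer than CHAR({n})")
    return value.ljust(n)

def char_equal(a: str, b: str) -> bool:
    """ANSI comparison: trailing spaces are ignored."""
    return a.rstrip(" ") == b.rstrip(" ")

stored = char_store("ab", 3)       # stored as 'ab '
assert stored == "ab "
assert char_equal(stored, "ab")    # 'ab ' equals 'ab' under ANSI rules
```

An engine that skips the padding on write, or compares byte-for-byte, will disagree with one that follows both rules, which is the inconsistency being discussed.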
Hi, All.
Apache Spark has suffered from a known consistency issue in `CHAR`
type behavior across its usages and configurations. However, the evolution
has been gradually moving toward consistency inside Apache
Spark, because we don't support `CHAR` officially. The following is the
I have a process in Apache Spark that attempts to write HFiles to S3 in
batches. I want the resulting HFiles in the same directory, since they belong
to the same column family. However, I'm getting a 'directory already exists'
error when I try to run this on AWS EMR. How can I write HFiles
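One workaround, assuming the error comes from Hadoop's output committer refusing to write into an existing path, is to give each batch its own fresh subdirectory and then bulk-load each directory into HBase afterwards (e.g. with `LoadIncrementalHFiles`, run once per batch directory). The helper below is a hypothetical sketch of the path bookkeeping, not HBase code:

```python
# Hypothetical sketch: write each Spark batch to a unique directory so the
# Hadoop output committer never targets an existing path; the HFiles can
# then be bulk-loaded into the same column family per directory.
import os
import tempfile
import uuid

def batch_output_dir(base_dir: str) -> str:
    """Create and return a fresh per-batch output directory."""
    path = os.path.join(base_dir, f"batch-{uuid.uuid4().hex}")
    os.makedirs(path, exist_ok=False)  # fail loudly if it somehow exists
    return path

base = tempfile.mkdtemp()
d1 = batch_output_dir(base)  # pass this as the HFile output path for batch 1
d2 = batch_output_dir(base)  # the next batch gets a different path
assert d1 != d2 and os.path.isdir(d1) and os.path.isdir(d2)
```

The trade-off is one bulk-load invocation per batch instead of one directory of HFiles, but it avoids the committer collision entirely.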
WARN NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
*I was chasing this warning when I found misinformation from Spark training
companies such as Edureka, who offer weird and wonderful suggestions*