Definitely something is wrong. For me, it takes 10 to 30 minutes.
Thanks.
Zhan Zhang
On Sep 23, 2014, at 10:02 PM, christy 760948...@qq.com wrote:
This process began yesterday and has already run for more than 20 hours.
Is that normal? Has anyone had the same problem? No error has been thrown yet.
Try this:
import org.apache.spark.SparkContext._
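In Spark 1.x this import brings the implicit conversion to PairRDDFunctions into scope, which is what defines join on RDDs of key/value pairs. A minimal sketch (the sample data is hypothetical, not from the thread):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // enables join on RDD[(K, V)] in Spark 1.x

object JoinTest {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("JoinTest"))
    val left  = sc.parallelize(Seq((1, "a"), (2, "b")))
    val right = sc.parallelize(Seq((1, "x"), (3, "y")))
    // Without the import above, this line fails to compile with
    // "value join is not a member of org.apache.spark.rdd.RDD[(Int, String)]".
    val joined = left.join(right) // RDD[(Int, (String, String))]
    joined.collect().foreach(println)
    sc.stop()
  }
}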
Thanks.
Zhan Zhang
On Sep 4, 2014, at 4:36 PM, Veeranagouda Mukkanagoudar veera...@gmail.com
wrote:
I am planning to use the RDD join operation; to test it out I was trying to compile
some test code, but I am getting the following compilation error
://sandbox.hortonworks.com:8020/tmp/wordcount)
Thanks.
Zhan Zhang
On Aug 26, 2014, at 12:35 AM, motte1988 wir12...@studserv.uni-leipzig.de
wrote:
Hello,
it's me again.
Now I've got an explanation for the behaviour. It seems that the driver
memory is not large enough to hold the whole result set
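If that diagnosis is right, the usual fixes are to keep the result distributed instead of collect()ing it into the driver, or to raise the driver heap. A sketch, with hypothetical paths (the HDFS URI mirrors the wordcount path quoted above) and assuming an existing SparkContext sc:

import org.apache.spark.SparkContext._ // for reduceByKey in Spark 1.x

// Keep large results on the cluster rather than pulling them into the driver.
val counts = sc.textFile("hdfs://sandbox.hortonworks.com:8020/tmp/input")
  .flatMap(_.split("\\s+"))
  .map(word => (word, 1))
  .reduceByKey(_ + _)

counts.saveAsTextFile("hdfs://sandbox.hortonworks.com:8020/tmp/wordcount") // distributed write
// counts.collect() // this is what overflows the driver when the result set is large

If collecting really is unavoidable, spark-submit's --driver-memory flag raises the driver heap.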
I think it depends on your job. My personal experience from running TB-scale data:
Spark got connection-loss failures when I used a big JVM with large memory, but with
more executors with small memory it ran very smoothly. I was running Spark
on YARN.
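A sketch of that layout on YARN (the numbers are illustrative, not from the thread; the same settings are available as spark-submit flags):

import org.apache.spark.{SparkConf, SparkContext}

// Favor many small executors over one big JVM, as described above.
val conf = new SparkConf()
  .setAppName("TBScaleJob")
  .set("spark.executor.instances", "50") // more executors...
  .set("spark.executor.memory", "4g")    // ...each with a modest heap
  .set("spark.executor.cores", "2")
val sc = new SparkContext(conf)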
Thanks.
Zhan Zhang
On Aug 21, 2014, at 3:42 PM
the reduceByKey because it is not cached.
I agree with you; it is very confusing.
Thanks.
Zhan Zhang
On Aug 20, 2014, at 2:28 PM, Patrick Wendell pwend...@gmail.com wrote:
The reason is that some operators get pipelined into a single stage.
rdd.map(XX).filter(YY) - this executes in a single stage.
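A sketch of the distinction (the RDD is hypothetical): narrow transformations like map and filter are pipelined into one stage, while reduceByKey forces a shuffle and starts a new one; toDebugString prints the resulting stage boundaries.

import org.apache.spark.SparkContext._ // for reduceByKey in Spark 1.x

val nums      = sc.parallelize(1 to 1000000)
val pipelined = nums.map(x => (x % 10, x)).filter(_._2 > 5) // runs as one stage
val shuffled  = pipelined.reduceByKey(_ + _)                // shuffle => new stage
println(shuffled.toDebugString) // shows the shuffle boundary in the lineage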
String HBASE_TABLE_NAME = "hbase.table.name";
Thanks.
Zhan Zhang
On Aug 17, 2014, at 11:39 PM, Cesar Arevalo ce...@zephyrhealthinc.com wrote:
HadoopRDD
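A sketch of reading an HBase table into an RDD (the table name is hypothetical, and the classes are my assumption of what the thread used; note the standard key for the input table is TableInputFormat.INPUT_TABLE rather than the hbase.table.name constant above):

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat

val hbaseConf = HBaseConfiguration.create()
hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table") // hypothetical table name

// newAPIHadoopRDD wraps the table scan in an RDD of (row key, row) pairs.
val hbaseRDD = sc.newAPIHadoopRDD(
  hbaseConf,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result])
println(hbaseRDD.count())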
Thanks.
Zhan Zhang
On Aug 18, 2014, at 11:26 AM, Peng Cheng pc...@uow.edu.au wrote:
I'm curious whether, if you declare the broadcast wrapper as a var and
overwrite it in the driver program, the modification can have a stable impact
on all transformations/actions defined BEFORE the overwrite
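A sketch of the pattern in question (values are illustrative): a closure defined before the overwrite captures the var itself, so what it sees depends on when its tasks are serialized, which is exactly why the behavior is worth testing.

var lookup = sc.broadcast(Map(1 -> "old"))
// Defined BEFORE the overwrite; the closure captures the var `lookup`.
val before = sc.parallelize(Seq(1)).map(k => lookup.value.getOrElse(k, "?"))

lookup.unpersist()                     // drop the stale copies on the executors
lookup = sc.broadcast(Map(1 -> "new")) // overwrite in the driver

// Runs after the overwrite; whether it prints "old" or "new" hinges on
// when the captured var was serialized into the task closure.
println(before.collect().mkString(","))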
I tried a simple spark-hive select and insert, and it works. But to directly
manipulate ORC files through RDDs, Spark has to be upgraded to support
hive-0.13 first, because some ORC APIs are not exposed in Hive-0.12.
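A sketch of that kind of select-and-insert through spark-hive (table names are hypothetical; assumes a HiveContext built on an existing SparkContext sc):

import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)
hiveContext.sql("CREATE TABLE IF NOT EXISTS orc_out (key INT, value STRING) STORED AS ORC")
hiveContext.sql("INSERT INTO TABLE orc_out SELECT key, value FROM src")
hiveContext.sql("SELECT count(*) FROM orc_out").collect().foreach(println)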
Thanks.
Zhan Zhang
On Aug 11, 2014, at 10:23 PM, vinay.kash
Yes, you are right, but I tried the old hadoopFile with OrcInputFormat. In Hive-0.12,
OrcStruct does not expose its API, so Spark cannot access it. With Hive-0.13, an RDD
can read from an ORC file. Btw, I didn't see ORCNewOutputFormat in hive-0.13.
Direct RDD manipulation (Hive13)
val inputRead =
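The assignment above is cut off in the archive; a sketch of how such a direct read might look (the path is hypothetical and the classes are my assumption of the ones meant, requiring Hive 0.13 so that OrcStruct's API is accessible):

import org.apache.hadoop.hive.ql.io.orc.{OrcInputFormat, OrcStruct}
import org.apache.hadoop.io.NullWritable

val inputRead = sc.hadoopFile(
  "hdfs://sandbox.hortonworks.com:8020/tmp/orc_input", // hypothetical path
  classOf[OrcInputFormat],
  classOf[NullWritable],
  classOf[OrcStruct])
// Each record is (NullWritable, OrcStruct); the struct holds the row's columns.
println(inputRead.map(_._2.toString).first())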
I agree. We need support similar to the Parquet file support for end users. That's the
purpose of SPARK-2883.
Thanks.
Zhan Zhang
On Aug 14, 2014, at 11:42 AM, Yin Huai huaiyin@gmail.com wrote:
I feel that using hadoopFile and saveAsHadoopFile to read and write ORC files
is more towards
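For contrast, the Parquet-style end-user API the thread wants ORC to match already existed in Spark 1.x SQL (the ORC analogue did not land until later; the record type and paths here are illustrative):

import org.apache.spark.sql.SQLContext

case class Record(key: Int, value: String)

val sqlContext = new SQLContext(sc)
import sqlContext.createSchemaRDD // implicit RDD -> SchemaRDD conversion

val data = sc.parallelize(Seq(Record(1, "a"), Record(2, "b")))
data.saveAsParquetFile("/tmp/records.parquet")              // one-line write
val loaded = sqlContext.parquetFile("/tmp/records.parquet") // one-line read
loaded.registerTempTable("records")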