To: Evo Eftimov
Cc: Christian Perez; user
Subject: Re: Super slow caching in 1.3?
Here are the types that we specialize; other types will be much slower.
This is only for Spark SQL; normal RDDs do not serialize data that is
cached. I'll also note that until yesterday we were missing FloatType.
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
--
Christian Perez
Silicon Valley Data Science
Data Analyst
christ...@svds.com
@cp_phd
just falling back to
kryo and even then there are some locking issues).
If so, would it be possible to try caching a flattened version?
CACHE TABLE flattenedTable AS SELECT ... FROM parquetTable
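
As a rough sketch of what "flattened" could mean here: the helper below builds a CACHE TABLE ... AS SELECT statement that projects nested fields into top-level columns. The helper name and the column names (id, address.city, address.zip) are hypothetical and only for illustration; the real schema is not shown in this thread. The resulting string would then be handed to sqlContext.sql(...).

```python
def flattened_cache_statement(source_table, target_table, fields):
    """Build a CACHE TABLE ... AS SELECT statement.

    Dotted paths such as "address.city" are projected to top-level
    columns named "address_city"; plain column names pass through.
    """
    if not fields:
        raise ValueError("need at least one field to select")
    projections = []
    for path in fields:
        if "." in path:
            # Nested field: give it a flat alias.
            projections.append("{0} AS {1}".format(path, path.replace(".", "_")))
        else:
            projections.append(path)
    return "CACHE TABLE {0} AS SELECT {1} FROM {2}".format(
        target_table, ", ".join(projections), source_table
    )


# Hypothetical column names; substitute the real schema.
statement = flattened_cache_statement(
    "parquetTable", "flattenedTable", ["id", "address.city", "address.zip"]
)
print(statement)
```

Whether the flattened columns end up as types that the in-memory columnar cache specializes depends on the leaf types in the actual schema.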
On Mon, Apr 6, 2015 at 5:00 PM, Christian Perez christ...@svds.com wrote:
Hi all,
Has anyone else noticed very slow time to cache a Parquet file? It
takes 14 s per 235 MB (1 block) uncompressed node local Parquet file
on M2 EC2 instances. Or are my expectations way off...
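
For scale, the figures above work out to a bit under 17 MB/s per block. A quick back-of-the-envelope check (the function name is just for illustration, and this is only the arithmetic on the numbers quoted, not a benchmark):

```python
def cache_throughput_mb_per_s(size_mb, seconds):
    """Effective caching throughput in MB/s for one block."""
    return size_mb / seconds


# Numbers quoted in the message: 235 MB cached in 14 s.
throughput = cache_throughput_mb_per_s(235.0, 14.0)
print("%.1f MB/s" % throughput)
```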
Cheers,
Christian
-
Any other users interested in a feature
DataFrame.saveAsExternalTable() for making _useful_ external tables in
Hive, or am I the only one? Bueller? If I start a PR for this, will it
be taken seriously?
On Thu, Mar 19, 2015 at 9:34 AM, Christian Perez christ...@svds.com wrote:
Hi Yin,
Thanks
... but before I get any
deeper, can anyone reproduce this behavior?
Cheers,
Christian
-
will be org.apache.spark.sql.parquet.DefaultSource. You can also
look at your files in the file system. They are stored by Parquet.
Thanks,
Yin
On Thu, Mar 19, 2015 at 12:00 PM, Christian Perez christ...@svds.com
wrote:
Hi all,
DataFrame.saveAsTable creates a managed table in Hive (v0.13