Hi Pramod
Is your data compressed? I encountered a similar problem; however, even after turning
codegen on, the GC time was still very long. The input data for my map
task is an lzo file of about 100 MB.
My query is:
select ip, count(*) as c from stage_bitauto_adclick_d group by ip sort by c limit 100
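For reference, the query's logic (group by ip, count, sort by the count, take the first 100) can be sketched in plain Scala collections. The sample data here is made up, and note that HiveQL's SORT BY is ascending by default, which this sketch mirrors:

```scala
// Hypothetical sample of ip values; stands in for the stage_bitauto_adclick_d table.
val ips = Seq("1.1.1.1", "2.2.2.2", "1.1.1.1", "3.3.3.3", "1.1.1.1", "2.2.2.2")

// select ip, count(*) as c ... group by ip
val counts = ips.groupBy(identity).map { case (ip, hits) => (ip, hits.size) }

// sort by c (ascending, HiveQL's default) limit 100
val top = counts.toSeq.sortBy { case (_, c) => c }.take(100)

println(top)
```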
Hi
I was trying to create an external table named adclicktable via the API
def createExternalTable(tableName: String, path: String). I can then get the schema
of this table successfully, as shown below, and the table can be queried
normally. The data files are all Parquet files.
sqlContext.sql("describe adclicktable")
Hi,
I did some tests on Parquet Files with Spark SQL DataFrame API.
I generated 36 gzip-compressed Parquet files with Spark SQL and stored them on
Tachyon. Each file is about 222 MB. Then I read them with the code below.
val tfs
zhangxiongfei wrote:
Hi experts,
I run the code below in the Spark Shell to access Parquet files in Tachyon.
1. First, created a DataFrame by loading a bunch of Parquet files in Tachyon:
val ta3 = sqlContext.parquetFile("tachyon://tachyonserver:19998/apps/tachyon/zhangxf/parquetAdClick-6p-256m")
2. Second