from:"CPC"

spark hive concurrency

2019-04-29 Thread CPC

Hi all, does Spark 2 support concurrency on Hive tables? When we query with Hive and issue SHOW LOCKS, we can see shared locks. But when we query the same tables with Spark SQL, we do not see any locks on them. Thanks in advance.

Re: Spark Optimization

2018-04-26 Thread CPC

I would recommend UseParallelGC since this is a batch job. Parallelism should be 2-3x the number of cores. Also, if those are physical machines, I would recommend a network MTU of 9000. Is it 128 GB per node or 64 GB per node? On Thu, Apr 26, 2018, 7:40 PM vincent gromakowski < vincent.gromakow...@gmail.com>
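
A rough sketch of how the GC and parallelism advice above could be turned into spark-submit settings. The helper names (suggested_spark_conf, to_submit_args) and the 16-core example are made up for illustration, and mapping "parallelization" onto spark.default.parallelism and spark.sql.shuffle.partitions is an assumption on my part, not something stated in the thread. The MTU is an OS/network setting and is not covered here.

```python
def suggested_spark_conf(total_cores, tasks_per_core=3):
    """Return Spark settings following the 2-3x-of-cores rule of thumb.

    total_cores    -- total executor cores available to the job (assumed input)
    tasks_per_core -- multiplier; the thread suggests 2 to 3
    """
    if total_cores < 1 or tasks_per_core < 1:
        raise ValueError("total_cores and tasks_per_core must be positive")

    partitions = total_cores * tasks_per_core
    return {
        # Assumption: "parallelization" covers both RDD and SQL shuffles.
        "spark.default.parallelism": str(partitions),
        "spark.sql.shuffle.partitions": str(partitions),
        # Throughput-oriented collector, as recommended for batch jobs.
        "spark.executor.extraJavaOptions": "-XX:+UseParallelGC",
    }


def to_submit_args(conf):
    """Flatten a settings dict into spark-submit '--conf key=value' arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # Hypothetical cluster with 16 executor cores in total.
    print(" ".join(to_submit_args(suggested_spark_conf(16))))
```

The multiplier is kept as a parameter so that both ends of the 2-3x range can be tried against the actual workload.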

Re: parquet late column materialization

2018-03-18 Thread CPC

this kind of optimization ( https://aws.amazon.com/about-aws/whats-new/2017/12/amazon-redshift-introduces-late-materialization-for-faster-query-processing/ ) Thanks. On Mar 18, 2018 8:09 PM, "nguyen duc Tuan" <newvalu...@gmail.com> wrote: > Hi @CPC, > Parquet is column storage

parquet late column materialization

2018-03-18 Thread CPC

Hi everybody, I am trying to understand how Spark reads Parquet files, but I am a little confused. I have a table with 4 columns named businesskey, transactionname, request and response. The request and response columns are huge (10-50 KB). When I execute a query like "select * from mytable