+user list

2017-11-02 9:57 GMT+08:00 俊杰陈 <cjjnj...@gmail.com>:

> Hi Mostafa
>
> Cheng already put the profile in the thread.
>
> Here is another profile for the Impala release version. You can also see the
> attachment.
>
>
> 2017-11-02 9:30 GMT+08:00 Mostafa Mokhtar <mmokh...@cloudera.com>:
>
>> Attaching the query profile will be most helpful to investigate this
>> issue.
>>
>> If you can capture the profile from the WebUI on the coordinator node it
>> would be great.
>>
>> On Wed, Nov 1, 2017 at 6:22 PM, 俊杰陈 <cjjnj...@gmail.com> wrote:
>>
>> > Thanks Hongxu,
>> >
>> > Here are the configurations on my cluster; most of them are default values.
>> > Which item do you think may have an impact?
>> >
>> > ABORT_ON_DEFAULT_LIMIT_EXCEEDED: [0]
>> > ABORT_ON_ERROR: [0]
>> > ALLOW_UNSUPPORTED_FORMATS: [0]
>> > APPX_COUNT_DISTINCT: [0]
>> > BATCH_SIZE: [0]
>> > COMPRESSION_CODEC: [NONE]
>> > DEBUG_ACTION: []
>> > DEFAULT_ORDER_BY_LIMIT: [-1]
>> > DISABLE_CACHED_READS: [0]
>> > DISABLE_CODEGEN: [0]
>> > DISABLE_OUTERMOST_TOPN: [0]
>> > DISABLE_ROW_RUNTIME_FILTERING: [0]
>> > DISABLE_STREAMING_PREAGGREGATIONS: [0]
>> > DISABLE_UNSAFE_SPILLS: [0]
>> > ENABLE_EXPR_REWRITES: [1]
>> > EXEC_SINGLE_NODE_ROWS_THRESHOLD: [100]
>> > EXPLAIN_LEVEL: [1]
>> > HBASE_CACHE_BLOCKS: [0]
>> > HBASE_CACHING: [0]
>> > MAX_BLOCK_MGR_MEMORY: [0]
>> > MAX_ERRORS: [100]
>> > MAX_IO_BUFFERS: [0]
>> > MAX_NUM_RUNTIME_FILTERS: [10]
>> > MAX_SCAN_RANGE_LENGTH: [0]
>> > MEM_LIMIT: [0]
>> > MT_DOP: [0]
>> > NUM_NODES: [0]
>> > NUM_SCANNER_THREADS: [0]
>> > OPTIMIZE_PARTITION_KEY_SCANS: [0]
>> > PARQUET_ANNOTATE_STRINGS_UTF8: [0]
>> > PARQUET_FALLBACK_SCHEMA_RESOLUTION: [0]
>> > PARQUET_FILE_SIZE: [0]
>> > PREFETCH_MODE: [1]
>> > QUERY_TIMEOUT_S: [0]
>> > REPLICA_PREFERENCE: [0]
>> > REQUEST_POOL: []
>> > RESERVATION_REQUEST_TIMEOUT: [0]
>> > RM_INITIAL_MEM: [0]
>> > RUNTIME_BLOOM_FILTER_SIZE: [1048576]
>> > RUNTIME_FILTER_MAX_SIZE: [16777216]
>> > RUNTIME_FILTER_MIN_SIZE: [1048576]
>> > RUNTIME_FILTER_MODE: [2]
>> > RUNTIME_FILTER_WAIT_TIME_MS: [0]
>> > S3_SKIP_INSERT_STAGING: [1]
>> > SCAN_NODE_CODEGEN_THRESHOLD: [1800000]
>> > SCHEDULE_RANDOM_REPLICA: [0]
>> > SCRATCH_LIMIT: [-1]
>> > SEQ_COMPRESSION_MODE: [0]
>> > STRICT_MODE: [0]
>> > SUPPORT_START_OVER: [false]
>> > SYNC_DDL: [0]
>> > V_CPU_CORES: [0]
>> >
>> > 2017-10-31 15:30 GMT+08:00 Hongxu Ma <inte...@outlook.com>:
>> >
>> > > Hi JJ
>> > > Considering it only takes 3 minutes on Spark SQL, maybe there are some
>> > > mistakes in the query options.
>> > > Try running "set;" in impala-shell and check all query options, e.g.:
>> > > BATCH_SIZE: [0]
>> > > DISABLE_CODEGEN: [0]
>> > > RUNTIME_FILTER_MODE: GLOBAL
>> > >
>> > > Just a guess, thanks.
>> > >
>> > > On 27/10/2017 10:25, 俊杰陈 wrote:
>> > > The profile file is damaged. Here is a screenshot of the exec summary
>> > > [cid:ii_j999ymep1_15f5ba563aeabb91]
>> > >
>> > >
>> > > 2017-10-27 10:04 GMT+08:00 俊杰陈 <cjjnj...@gmail.com>:
>> > > Hi Devs
>> > >
>> > > I ran into a performance issue on a big table join. The query takes more
>> > > than 3 hours on Impala and only 3 minutes on Spark SQL on the same
>> > > 5-node cluster. When running the query, the left scanner and exchange
>> > > node are very slow. Did I miss some key arguments?
>> > >
>> > > You can see the profile file in the attachment.
>> > >
>> > > [cid:ii_j9998pph2_15f5b92f2cf47020]
>> > >
>> > > --
>> > > Thanks & Best Regards
>> > >
>> > >
>> > >
>> > > --
>> > > Thanks & Best Regards
>> > >
>> > >
>> > > --
>> > > Regards,
>> > > Hongxu.
>> > >
>> >
>> >
>> >
>> > --
>> > Thanks & Best Regards
>> >
>
>
>
> --
> Thanks & Best Regards
>

--
Thanks & Best Regards
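
P.S. For anyone repeating the "set;" check suggested in the quoted thread: below is a minimal sketch, not something posted by the participants, of pulling a few named options out of a saved option dump. It assumes the dump uses the `NAME: [value]` layout shown in the listing above; the function names `parse_query_options` and `report` are hypothetical, and the sample text is a shortened copy of that listing.

```python
import re

# Matches lines such as "DISABLE_CODEGEN: [0]".
# Leading mail quote markers ("> ", ">> > ") are tolerated (assumption:
# the dump may have been copied out of a quoted message).
OPTION_LINE = re.compile(r"^[>\s]*([A-Z0-9_]+):\s*\[(.*)\]\s*$")


def parse_query_options(text):
    """Return {option_name: value_string} for every NAME: [value] line."""
    options = {}
    for line in text.splitlines():
        match = OPTION_LINE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
    return options


def report(options, names):
    """Map each requested name to its value, or None when it is absent."""
    return {name: options.get(name) for name in names}


# Shortened sample in the same layout as the option listing in the thread.
sample = """\
BATCH_SIZE: [0]
>> > DISABLE_CODEGEN: [0]
RUNTIME_FILTER_MODE: [2]
REQUEST_POOL: []
"""

options = parse_query_options(sample)
print(report(options, ["BATCH_SIZE", "DISABLE_CODEGEN", "RUNTIME_FILTER_MODE"]))
```

Values are kept as strings on purpose, since the dump mixes numbers, names such as NONE, and empty brackets.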