Thanks Hongxu,
Here are the query options on my cluster; most of them are default values.
Which option do you think might be affecting performance?
ABORT_ON_DEFAULT_LIMIT_EXCEEDED: [0]
ABORT_ON_ERROR: [0]
ALLOW_UNSUPPORTED_FORMATS: [0]
APPX_COUNT_DISTINCT: [0]
BATCH_SIZE: [0]
COMPRESSION_CODEC: [NONE]
DEBUG_ACTION: []
DEFAULT_ORDER_BY_LIMIT: [-1]
DISABLE_CACHED_READS: [0]
DISABLE_CODEGEN: [0]
DISABLE_OUTERMOST_TOPN: [0]
DISABLE_ROW_RUNTIME_FILTERING: [0]
DISABLE_STREAMING_PREAGGREGATIONS: [0]
DISABLE_UNSAFE_SPILLS: [0]
ENABLE_EXPR_REWRITES: [1]
EXEC_SINGLE_NODE_ROWS_THRESHOLD: [100]
EXPLAIN_LEVEL: [1]
HBASE_CACHE_BLOCKS: [0]
HBASE_CACHING: [0]
MAX_BLOCK_MGR_MEMORY: [0]
MAX_ERRORS: [100]
MAX_IO_BUFFERS: [0]
MAX_NUM_RUNTIME_FILTERS: [10]
MAX_SCAN_RANGE_LENGTH: [0]
MEM_LIMIT: [0]
MT_DOP: [0]
NUM_NODES: [0]
NUM_SCANNER_THREADS: [0]
OPTIMIZE_PARTITION_KEY_SCANS: [0]
PARQUET_ANNOTATE_STRINGS_UTF8: [0]
PARQUET_FALLBACK_SCHEMA_RESOLUTION: [0]
PARQUET_FILE_SIZE: [0]
PREFETCH_MODE: [1]
QUERY_TIMEOUT_S: [0]
REPLICA_PREFERENCE: [0]
REQUEST_POOL: []
RESERVATION_REQUEST_TIMEOUT: [0]
RM_INITIAL_MEM: [0]
RUNTIME_BLOOM_FILTER_SIZE: [1048576]
RUNTIME_FILTER_MAX_SIZE: [16777216]
RUNTIME_FILTER_MIN_SIZE: [1048576]
RUNTIME_FILTER_MODE: [2]
RUNTIME_FILTER_WAIT_TIME_MS: [0]
S3_SKIP_INSERT_STAGING: [1]
SCAN_NODE_CODEGEN_THRESHOLD: [1800000]
SCHEDULE_RANDOM_REPLICA: [0]
SCRATCH_LIMIT: [-1]
SEQ_COMPRESSION_MODE: [0]
STRICT_MODE: [0]
SUPPORT_START_OVER: [false]
SYNC_DDL: [0]
V_CPU_CORES: [0]
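For anyone repeating this check, here is a minimal Python sketch that parses the "NAME: [value]" lines printed by "set;" and compares them against the three values Hongxu listed. The function names and the numeric-to-name mapping for RUNTIME_FILTER_MODE (0=OFF, 1=LOCAL, 2=GLOBAL) are my own assumptions, not something taken from impala-shell itself.

```python
import re

# Matches lines such as "BATCH_SIZE: [0]" from the impala-shell "set;" output.
OPTION_LINE = re.compile(r"^\s*([A-Z0-9_]+):\s*\[(.*)\]\s*$")

# Assumption: numeric codes for RUNTIME_FILTER_MODE (0=OFF, 1=LOCAL, 2=GLOBAL).
RUNTIME_FILTER_MODES = {"0": "OFF", "1": "LOCAL", "2": "GLOBAL"}


def parse_options(text):
    """Return {option_name: value} for every 'NAME: [value]' line in text."""
    options = {}
    for line in text.splitlines():
        match = OPTION_LINE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
    return options


def check_options(options, expected):
    """Return {name: (actual, wanted)} for options that differ from expected."""
    mismatches = {}
    for name, wanted in expected.items():
        actual = options.get(name)
        # Translate the numeric filter mode into its name before comparing.
        if name == "RUNTIME_FILTER_MODE" and actual in RUNTIME_FILTER_MODES:
            actual = RUNTIME_FILTER_MODES[actual]
        if actual != wanted:
            mismatches[name] = (actual, wanted)
    return mismatches


if __name__ == "__main__":
    # A few lines copied from the "set;" output above.
    sample = "BATCH_SIZE: [0]\nDISABLE_CODEGEN: [0]\nRUNTIME_FILTER_MODE: [2]\n"
    # The values suggested in the quoted reply below.
    expected = {
        "BATCH_SIZE": "0",
        "DISABLE_CODEGEN": "0",
        "RUNTIME_FILTER_MODE": "GLOBAL",
    }
    print(check_options(parse_options(sample), expected) or "no mismatches")
```

An empty result only means these three options match the suggested values; it says nothing about whether the other options are involved in the slowdown.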
2017-10-31 15:30 GMT+08:00 Hongxu Ma <[email protected]>:
> Hi JJ
> Considering it only takes 3 minutes on Spark SQL, maybe some query options
> are set incorrectly.
> Try running "set;" in impala-shell and check all the query options, e.g.:
> BATCH_SIZE: [0]
> DISABLE_CODEGEN: [0]
> RUNTIME_FILTER_MODE: GLOBAL
>
> Just a guess, thanks.
>
> On 27/10/2017 10:25, 俊杰陈 wrote:
> The profile file is damaged. Here is a screenshot of the exec summary:
> [inline image: exec summary screenshot]
>
>
> 2017-10-27 10:04 GMT+08:00 俊杰陈 <[email protected]>:
> Hi Devs
>
> I ran into a performance issue with a big-table join. The query takes more
> than 3 hours on Impala but only 3 minutes on Spark SQL on the same 5-node
> cluster. While the query is running, the left scanner and exchange node are
> very slow. Did I miss some key arguments?
>
> You can see the profile file in the attachment.
>
> [inline image]
>
> --
> Thanks & Best Regards
>
>
>
> --
> Thanks & Best Regards
>
>
> --
> Regards,
> Hongxu.
>
--
Thanks & Best Regards