Thanks for confirming.

It feels like parquet-reader is a standalone program,
and nothing uses it apart from the tests.

I am trying to trace what happens between a query being submitted,
say "select * from some_parquet_table limit 10",
and impalad actually reading the Parquet file.

Can you suggest some files for me to look at?

=======================

$ git grep parquet-reader
be/src/util/CMakeLists.txt:add_executable(parquet-reader parquet-reader.cc)
be/src/util/CMakeLists.txt:target_link_libraries(parquet-reader ${IMPALA_LINK_LIBS})
tests/query_test/test_insert_parquet.py:    parquet table and running the parquet-reader tool on it, which performs sanity
tests/query_test/test_insert_parquet.py: check_call([os.path.join(impalad_basedir, 'util/parquet-reader'), '--file',

On 2018/05/07 10:32, Jim Apple wrote:
It is related. Try git grep parquet-reader (without the .cc) to learn more.

On Sun, May 6, 2018 at 5:42 PM, Chan Chor Pang <[email protected]>
wrote:

Hi everyone,

I am having a problem with a long-running cluster:
queries on Parquet tables freeze after the cluster has been up for a long time (around 2~3 months).
I can't find any related error message; those queries just stop making progress.
After some research, I suspect IMPALA-5742 may be the problem,
because I also observe some memory leak symptoms from impalad,
although it's weird that the memory leak symptoms can only be observed on the
coordinator node.
Also, I can't find any usage of parquet-reader.cc by simply grepping the Impala
source.

So is parquet-reader.cc even related to impalad reading Parquet files?


--
---*------------------------------------------------*---*---*---*---
INDETAIL Inc.
Nearshore Integrated Services Division
Business Solutions Department
Chan Chor Pang
E-mail :[email protected]
URL : https://www.indetail.co.jp

[Sapporo Head Office / LABO / LABO2]
060-0042
2F Kitako Center Building, 9-3-33 Odori Nishi, Chuo-ku, Sapporo
TEL:011-206-9235 FAX:011-206-9236

[Tokyo Branch]
108-0014
Cross Office Mita, 5-29-20 Shiba, Minato-ku, Tokyo
TEL:03-6809-6502  FAX:03-6809-6504
---*------------------------------------------------*---*---*---*---
