Thanks for confirming.
It looks like parquet-reader is a standalone program,
and nothing uses it apart from the tests.
I'm trying to trace what happens between a query being submitted,
say "select * from some_parquet_table limit 10",
and impalad actually reading the Parquet file.
Can you suggest some files for me to look at?
=======================
$ git grep parquet-reader
be/src/util/CMakeLists.txt:add_executable(parquet-reader parquet-reader.cc)
be/src/util/CMakeLists.txt:target_link_libraries(parquet-reader ${IMPALA_LINK_LIBS})
tests/query_test/test_insert_parquet.py: parquet table and running the parquet-reader tool on it, which performs sanity
tests/query_test/test_insert_parquet.py: check_call([os.path.join(impalad_basedir, 'util/parquet-reader'), '--file',
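For reference, the test appears to invoke the tool roughly as below. This is only a minimal sketch based on the grep output: 'util/parquet-reader' and '--file' come from that output, while the helper names, impalad_basedir value and file path are placeholders I made up, and whatever the real test passes after '--file' is cut off in the grep line.

```python
import os
import subprocess


def build_parquet_reader_cmd(impalad_basedir, parquet_file):
    """Build the argv list for the standalone parquet-reader tool."""
    # 'util/parquet-reader' and '--file' are taken from the grep output;
    # anything the real test passes after '--file' is not visible there.
    tool = os.path.join(impalad_basedir, 'util/parquet-reader')
    return [tool, '--file', parquet_file]


def run_parquet_reader(impalad_basedir, parquet_file):
    """Run the tool; check_call raises CalledProcessError on a non-zero exit."""
    subprocess.check_call(build_parquet_reader_cmd(impalad_basedir, parquet_file))
```

So it is called as an external process from the test, which matches my impression that impalad itself does not go through this binary.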
On 2018/05/07 10:32, Jim Apple wrote:
It is related. Try git grep parquet-reader (without the .cc) to learn more.
On Sun, May 6, 2018 at 5:42 PM, Chan Chor Pang <[email protected]>
wrote:
Hi everyone,
I'm having a problem with a long-running cluster.
Queries on Parquet tables freeze after the cluster has been running for a long time (around 2~3 months).
I can't find any related error message; the queries just stop making progress.
After some research, I suspect IMPALA-5742 may be the problem,
because I also observe some memory leak symptoms from impalad,
although it's weird that the memory leak symptoms can only be observed on the
coordinator node.
Also, I can't find any usage of parquet-reader.cc by simply grepping the Impala
source.
So is parquet-reader.cc even related to impalad reading Parquet files?
--
---*------------------------------------------------*---*---*---*---
INDETAIL Inc.
Nearshore Integrated Services Division
Business Solutions Department
Chan Chor Pang
E-mail :[email protected]
URL : https://www.indetail.co.jp
[Sapporo Head Office / LABO / LABO2]
〒060-0042
Kitako Center Building 2F, 9-3-33 Odori Nishi, Chuo-ku, Sapporo
TEL:011-206-9235 FAX:011-206-9236
[Tokyo Branch]
〒108-0014
Cross Office Mita, 5-29-20 Shiba, Minato-ku, Tokyo
TEL:03-6809-6502 FAX:03-6809-6504
---*------------------------------------------------*---*---*---*---