[ https://issues.apache.org/jira/browse/IMPALA-9896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
abeltian updated IMPALA-9896:
-----------------------------
    Description:
OutOfMemoryError: Requested array size exceeds VM limit when LocalCatalog is enabled.

The basic information of the large table is as follows:
101 columns, 785243 partitions, 5729866 files.
{code:java}
I0626 20:59:04.029678 3392438 jni-util.cc:256] java.lang.OutOfMemoryError: Requested array size exceeds VM limit
I0626 20:59:04.030231 3392438 status.cc:124] OutOfMemoryError: Requested array size exceeds VM limit
    @           0xb35f19
    @          0x113112e
    @           0xb23b87
    @           0xb0e339
    @           0xc15a52
    @           0xc09e4c
    @           0xb01de9
    @           0xf159e8
    @           0xf0cd7e
    @           0xf0dc11
    @          0x11a1e3f
    @          0x11a29e9
    @          0x1790be9
    @     0x7f55188a2e24
    @     0x7f55185cf35c
E0626 20:59:04.030258 3392438 catalog-server.cc:176] OutOfMemoryError: Requested array size exceeds VM limit
{code}
The source code corresponding to the error is as follows:
{code:java}
  void GetPartialCatalogObject(TGetPartialCatalogObjectResponse& resp,
      const TGetPartialCatalogObjectRequest& req) override {
    // TODO(todd): capture detailed metrics on the types of inbound requests, lock
    // wait times, etc.
    // TODO(todd): add some kind of limit on the number of concurrent requests here
    // to avoid thread exhaustion -- eg perhaps it would be best to use a trylock
    // on the catalog locks, or defer these calls to a separate (bounded) queue,
    // so a heavy query workload against a table undergoing a slow refresh doesn't
    // end up taking down the catalog by creating thousands of threads.
    VLOG_RPC << "GetPartialCatalogObject(): request=" << ThriftDebugString(req);
    Status status = catalog_server_->catalog()->GetPartialCatalogObject(req, &resp);
    if (!status.ok()) LOG(ERROR) << status.GetDetail(); //catalog-server.cc:176
    TStatus thrift_status;
    status.ToThrift(&thrift_status);
    resp.__set_status(thrift_status);
    VLOG_RPC << "GetPartialCatalogObject(): response=" << ThriftDebugString(resp);
  }
{code}

  was:
OutOfMemoryError: Requested array size exceeds VM limit when LocalCatalog is enabled.
The basic information of the large table is as follows:
101 columns, 785243 partitions, 5729866 files.
{code:java}
I0626 20:59:04.029678 3392438 jni-util.cc:256] java.lang.OutOfMemoryError: Requested array size exceeds VM limit
I0626 20:59:04.030231 3392438 status.cc:124] OutOfMemoryError: Requested array size exceeds VM limit
    @           0xb35f19
    @          0x113112e
    @           0xb23b87
    @           0xb0e339
    @           0xc15a52
    @           0xc09e4c
    @           0xb01de9
    @           0xf159e8
    @           0xf0cd7e
    @           0xf0dc11
    @          0x11a1e3f
    @          0x11a29e9
    @          0x1790be9
    @     0x7f55188a2e24
    @     0x7f55185cf35c
E0626 20:59:04.030258 3392438 catalog-server.cc:176] OutOfMemoryError: Requested array size exceeds VM limit
{code}
{code:java}
  void GetPartialCatalogObject(TGetPartialCatalogObjectResponse& resp,
      const TGetPartialCatalogObjectRequest& req) override {
    // TODO(todd): capture detailed metrics on the types of inbound requests, lock
    // wait times, etc.
    // TODO(todd): add some kind of limit on the number of concurrent requests here
    // to avoid thread exhaustion -- eg perhaps it would be best to use a trylock
    // on the catalog locks, or defer these calls to a separate (bounded) queue,
    // so a heavy query workload against a table undergoing a slow refresh doesn't
    // end up taking down the catalog by creating thousands of threads.
    VLOG_RPC << "GetPartialCatalogObject(): request=" << ThriftDebugString(req);
    Status status = catalog_server_->catalog()->GetPartialCatalogObject(req, &resp);
    if (!status.ok()) LOG(ERROR) << status.GetDetail(); //catalog-server.cc:176
    TStatus thrift_status;
    status.ToThrift(&thrift_status);
    resp.__set_status(thrift_status);
    VLOG_RPC << "GetPartialCatalogObject(): response=" << ThriftDebugString(resp);
  }
{code}
{code:java}
{code}

> OutOfMemoryError: Requested array size exceeds VM limit when LocalCatalog is enabled
> ------------------------------------------------------------------------------------
>
>                 Key: IMPALA-9896
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9896
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 3.2.0
>            Reporter: abeltian
>            Priority: Major
>              Labels: crashed, performance
>
> OutOfMemoryError: Requested array size exceeds VM limit when LocalCatalog is enabled.
> The basic information of the large table is as follows:
> 101 columns, 785243 partitions, 5729866 files.
> {code:java}
> I0626 20:59:04.029678 3392438 jni-util.cc:256] java.lang.OutOfMemoryError: Requested array size exceeds VM limit
> I0626 20:59:04.030231 3392438 status.cc:124] OutOfMemoryError: Requested array size exceeds VM limit
>     @           0xb35f19
>     @          0x113112e
>     @           0xb23b87
>     @           0xb0e339
>     @           0xc15a52
>     @           0xc09e4c
>     @           0xb01de9
>     @           0xf159e8
>     @           0xf0cd7e
>     @           0xf0dc11
>     @          0x11a1e3f
>     @          0x11a29e9
>     @          0x1790be9
>     @     0x7f55188a2e24
>     @     0x7f55185cf35c
> E0626 20:59:04.030258 3392438 catalog-server.cc:176] OutOfMemoryError: Requested array size exceeds VM limit
> {code}
> The source code corresponding to the error is as follows:
> {code:java}
>   void GetPartialCatalogObject(TGetPartialCatalogObjectResponse& resp,
>       const TGetPartialCatalogObjectRequest& req) override {
>     // TODO(todd): capture detailed metrics on the types of inbound requests, lock
>     // wait times, etc.
>     // TODO(todd): add some kind of limit on the number of concurrent requests here
>     // to avoid thread exhaustion -- eg perhaps it would be best to use a trylock
>     // on the catalog locks, or defer these calls to a separate (bounded) queue,
>     // so a heavy query workload against a table undergoing a slow refresh doesn't
>     // end up taking down the catalog by creating thousands of threads.
>     VLOG_RPC << "GetPartialCatalogObject(): request=" << ThriftDebugString(req);
>     Status status = catalog_server_->catalog()->GetPartialCatalogObject(req, &resp);
>     if (!status.ok()) LOG(ERROR) << status.GetDetail(); //catalog-server.cc:176
>     TStatus thrift_status;
>     status.ToThrift(&thrift_status);
>     resp.__set_status(thrift_status);
>     VLOG_RPC << "GetPartialCatalogObject(): response=" << ThriftDebugString(resp);
>   }
> {code}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
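
A rough sanity check on the figures in the report: Java arrays are indexed by int, so no single array can hold more than Integer.MAX_VALUE elements, and a VM may cap the length slightly lower. The report does not say which array allocation failed. The sketch below only assumes, hypothetically, that per-file metadata for the whole table ends up in one byte array, and computes how many bytes per file such a buffer could hold before hitting that bound. The class name, method names, and the per-file sizes are illustrative and are not taken from Impala.

```java
public class ArrayLimitEstimate {
    // Upper bound on Java array length. Real VMs may allow slightly fewer
    // elements, so this is an optimistic ceiling.
    public static final long MAX_ARRAY_LENGTH = Integer.MAX_VALUE;

    // File count quoted in the report.
    public static final long FILES = 5_729_866L;

    // True if items * bytesPerItem would not fit in a single Java byte array.
    // ASSUMPTION: all items are serialized into one contiguous buffer.
    public static boolean exceedsArrayLimit(long items, long bytesPerItem) {
        return Math.multiplyExact(items, bytesPerItem) > MAX_ARRAY_LENGTH;
    }

    // Largest whole number of bytes per item that still fits in one array.
    public static long maxBytesPerItem(long items) {
        return MAX_ARRAY_LENGTH / items;
    }

    public static void main(String[] args) {
        System.out.println("Break-even bytes per file: " + maxBytesPerItem(FILES));
        // Hypothetical per-file sizes, chosen only to bracket the break-even point.
        for (long bytes : new long[] {100L, 374L, 375L, 1000L}) {
            System.out.println(bytes + " bytes/file exceeds limit: "
                + exceedsArrayLimit(FILES, bytes));
        }
    }
}
```

Under that assumption, a single buffer covering all 5729866 files would hit the bound at an average of roughly 375 bytes per file, so a response covering the whole table could plausibly reach it. Whether this is what actually happens in the catalog would need to be confirmed from the Java-side stack trace, which is not included in the report.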