[ 
https://issues.apache.org/jira/browse/IMPALA-11677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17620713#comment-17620713
 ] 

Qihong Jiang edited comment on IMPALA-11677 at 10/20/22 5:05 AM:
-----------------------------------------------------------------

Hello [~csringhofer], I'm only using non-transactional tables right now, and 
they are equally slow. I tried the Bulk API last week, but the improvement 
was very small. I then referred to the Impala 3 code and changed the call to be 
asynchronous. Execution is much faster, but I don't know whether this carries 
any risk. 
{code:java}
public static List<Long> fireInsertEvents(MetaStoreClient msClient,
    TableInsertEventInfo insertEventInfo, String dbName, String tableName) {
  if (!insertEventInfo.isTransactional()) {
    LOG.info("fire the insert events asynchronously.");
    ExecutorService fireInsertEventThread = Executors.newSingleThreadExecutor();
    CompletableFuture.runAsync(() -> {
      try {
        fireInsertEventHelper(msClient.getHiveClient(),
            insertEventInfo.getInsertEventReqData(),
            insertEventInfo.getInsertEventPartVals(), dbName, tableName);
      } catch (Exception e) {
        // Pass the exception to the logger so the cause is not lost.
        LOG.error("failed to async call fireInsertEventHelper", e);
      } finally {
        // The client is closed on the worker thread once the call has finished.
        msClient.close();
        LOG.info("fire the insert events asynchronously end.");
      }
    }, fireInsertEventThread)
        // whenComplete also runs on failure, so the executor is always shut down.
        .whenComplete((result, error) -> fireInsertEventThread.shutdown());
  } else {
    Stopwatch sw = Stopwatch.createStarted();
    try {
      fireInsertTransactionalEventHelper(msClient.getHiveClient(),
          insertEventInfo, dbName, tableName);
    } catch (Exception e) {
      LOG.error("Failed to fire insert event. Some tables might not be"
          + " refreshed on other impala clusters.", e);
    } finally {
      LOG.info("Time taken to fire insert events on table {}.{}: {} msec",
          dbName, tableName, sw.stop().elapsed(TimeUnit.MILLISECONDS));
      msClient.close();
    }
  }
  // No event ids are returned on either path.
  return Collections.emptyList();
}{code}
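One possible variation, as a rough sketch only: the code above creates a new executor for every insert, whereas a single shared worker thread would avoid that cost and keep the events in submission order. The class below is a standalone illustration, not Impala code. {{AsyncEventFirer}}, {{fireAsync}} and {{failureCount}} are hypothetical names, and the two {{Runnable}} arguments stand in for the call to {{fireInsertEventHelper}} and for {{msClient.close()}}.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not part of Impala): runs event-firing tasks on one shared thread. */
public class AsyncEventFirer implements AutoCloseable {
  // One daemon worker shared by all calls, so no executor is created per insert.
  private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "fire-insert-events");
    t.setDaemon(true);
    return t;
  });

  // Counts tasks that threw, so callers can observe failures.
  private final AtomicInteger failures = new AtomicInteger();

  /** Submits fireTask; cleanup always runs on the worker thread, even if fireTask throws. */
  public CompletableFuture<Void> fireAsync(Runnable fireTask, Runnable cleanup) {
    return CompletableFuture.runAsync(() -> {
      try {
        fireTask.run();
      } catch (RuntimeException e) {
        failures.incrementAndGet();
        System.err.println("Failed to fire insert events asynchronously: " + e);
      } finally {
        cleanup.run();
      }
    }, executor);
  }

  /** Number of tasks that failed so far. */
  public int failureCount() {
    return failures.get();
  }

  /** Stops accepting tasks and waits up to 10 seconds for queued ones to finish. */
  @Override
  public void close() {
    executor.shutdown();
    try {
      executor.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

With this shape, a caller would keep one {{AsyncEventFirer}} for the lifetime of the process and pass each insert's work to {{fireAsync}}. The trade-off is the same as in the code above: the insert returns before the event is fired, so a failure is only visible in the log or the failure counter.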
I am new to Impala and would appreciate your guidance. Thank you!

 



> FireInsertEvents function can be very slow for tables with large number of 
> partitions.
> --------------------------------------------------------------------------------------
>
>                 Key: IMPALA-11677
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11677
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>    Affects Versions: Impala 4.1.0
>            Reporter: Qihong Jiang
>            Assignee: Qihong Jiang
>            Priority: Major
>
> In src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java, 
> the fireInsertEvents function can be very slow for tables with a large number 
> of partitions, so we should use asynchronous calls, just like in impala-3.x.


