jerryshao commented on PR #9588:
URL: https://github.com/apache/gravitino/pull/9588#issuecomment-3819287560

   > some problems while testing the rewrite job
   > 
   > 1. The Gravitino job jar doesn't include the Gravitino API package, which causes a ClassNotFoundException when executing Spark jobs
   > 2. Missing jar/archive handling logic, so I had to place the Iceberg Spark runtime jar under the Spark home `jars` directory
   > 3. It takes too long to sync the finished status to the client
   > 
   > ```
   > 2026-01-29 09:54:47.037 WARN [LocalJobExecutor-100] 
[org.apache.gravitino.job.local.LocalJobExecutor.runJob(LocalJobExecutor.java:295)]
 - Job local-job-f340c4d9-8ef0-4f44-a63e-c46740e54246 failed after starting 
with exit code: 1
   > 2026-01-29 09:59:28.008 INFO [job-status-pull] 
[org.apache.gravitino.job.JobManager.lambda$pullAndUpdateJobStatus$19(JobManager.java:616)]
 - Updated the job job-2097194451929410612 with execution id 
local-job-f340c4d9-8ef0-4f44-a63e-c46740e54246 status to FAILED
   > ```
   > 
   > For the escaping issues, it seems we don't need any special handling, since we use `ProcessBuilder`, not a shell, to execute the Spark command.
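   
   On the escaping point, a minimal sketch of why `ProcessBuilder` needs no shell escaping (the `echo` command below is a placeholder for illustration, not Gravitino's actual Spark invocation): each argument is handed to the child process verbatim, with no shell in between to interpret metacharacters.
   
   ```java
   import java.io.IOException;
   import java.nio.charset.StandardCharsets;
   
   public class ProcessBuilderEscapeDemo {
     public static void main(String[] args) throws IOException, InterruptedException {
       // The argument contains spaces, a semicolon, and a $() substitution.
       // ProcessBuilder passes it to the child process verbatim; no shell
       // ever parses it, so no quoting or escaping is required.
       Process p = new ProcessBuilder("echo", "a b; $(whoami)")
           .redirectErrorStream(true)
           .start();
       String out = new String(
           p.getInputStream().readAllBytes(), StandardCharsets.UTF_8).trim();
       p.waitFor();
       System.out.println(out); // a b; $(whoami)  (printed literally, not expanded)
     }
   }
   ```
   
   Contrast this with `Runtime.exec(String)` through a shell, where the same argument would be split on spaces and the `$()` would be expanded.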
   
   For the 1st issue, I can package the API jar into the shaded packaging. 
   
   For the 2nd problem, should I shade the Iceberg jar into this jar? If so, it will be bound to a specific Iceberg/Spark version, so my initial thought is not to put the Iceberg dependency into this job jar; it can be handled by users. What's your opinion?
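   
   If the Iceberg dependency stays out of the job jar, one user-side option (the path, version, and main class below are hypothetical, for illustration only) is to supply the runtime jar at submit time via spark-submit's `--jars` flag, which keeps the Gravitino job jar agnostic to the Iceberg/Spark version:
   
   ```
   spark-submit \
     --jars /path/to/iceberg-spark-runtime-3.5_2.12-1.6.1.jar \
     --class org.example.RewriteJob \
     gravitino-job.jar
   ```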


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
