BenjaminSSL opened a new issue, #1388: URL: https://github.com/apache/polaris/issues/1388
### Describe the bug

During concurrent HTTP requests of type **CREATE** and **DELETE** on the catalog and principal entities, while listing catalogs or principals, I am getting a **NullPointerException**. Polaris logs show the following:

```
INFO [org.apa.pol.ser.exc.IcebergExceptionMapper] [,POLARIS] [,,,] (executor-thread-54) Handling runtimeException null
INFO [org.apa.pol.ser.exc.IcebergExceptionMapper] [,POLARIS] [,,,] (executor-thread-54) Full RuntimeException: java.lang.NullPointerException
ERROR [org.apa.pol.ser.exc.IcebergExceptionMapper] [,POLARIS] [,,,] (executor-thread-54) Unhandled exception returning INTERNAL_SERVER_ERROR: java.lang.NullPointerException
```

The returned HTTP status code is **500** and the body for the catalog entity contains the following:

```
{
  "error": {
    "message":null,
    "type":"NullPointerException",
    "code":500
  }
}
```

For the principal entity, the body contains:

```
{
  "error": {
    "message":"Cannot invoke "org.apache.polaris.core.entity.PolarisBaseEntity.getCatalogId()" because "sourceEntity" is null",
    "type":"NullPointerException",
    "code":500
  }
}
```

Stack trace:

```
ERROR [org.apa.pol.ser.exc.IcebergExceptionMapper] [,POLARIS] [,,,] (executor-thread-52) Unhandled exception returning INTERNAL_SERVER_ERROR: java.lang.NullPointerException: Cannot invoke "org.apache.polaris.core.entity.PolarisBaseEntity.getCatalogId()" because "sourceEntity" is null
    at org.apache.polaris.core.entity.PolarisEntity.<init>(PolarisEntity.java:187)
    at org.apache.polaris.core.entity.PrincipalEntity.<init>(PrincipalEntity.java:26)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
    at java.base/java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:722)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:575)
    at java.base/java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
    at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:616)
    at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:622)
    at java.base/java.util.stream.ReferencePipeline.toList(ReferencePipeline.java:627)
    at org.apache.polaris.service.admin.PolarisServiceImpl.listPrincipals(PolarisServiceImpl.java:256)
    at org.apache.polaris.service.admin.PolarisServiceImpl_ClientProxy.listPrincipals(Unknown Source)
    at org.apache.polaris.service.admin.api.PolarisPrincipalsApi.listPrincipals(PolarisPrincipalsApi.java:232)
    at org.apache.polaris.service.admin.api.PolarisPrincipalsApi_Subclass.listPrincipals$$superforward(Unknown Source)
    at org.apache.polaris.service.admin.api.PolarisPrincipalsApi_Subclass$$function$$6.apply(Unknown Source)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
    at io.quarkus.micrometer.runtime.MicrometerTimedInterceptor.timedMethod(MicrometerTimedInterceptor.java:79)
    at io.quarkus.micrometer.runtime.MicrometerTimedInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
    at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
    at org.apache.polaris.service.admin.api.PolarisPrincipalsApi_Subclass.listPrincipals(Unknown Source)
    at org.apache.polaris.service.admin.api.PolarisPrincipalsApi$quarkusrestinvoker$listPrincipals_8247ae723efb90ecd5dc9ca10b28b13ed5c10c1d.invoke(Unknown Source)
    at org.jboss.resteasy.reactive.server.handlers.InvocationHandler.handle(InvocationHandler.java:29)
    at io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext.invokeHandler(QuarkusResteasyReactiveRequestContext.java:141)
    at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run(AbstractResteasyReactiveContext.java:147)
    at io.quarkus.vertx.core.runtime.VertxCoreRecorder$15.runWith(VertxCoreRecorder.java:638)
    at org.jboss.threads.EnhancedQueueExecutor$Task.doRunWith(EnhancedQueueExecutor.java:2675)
    at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2654)
    at org.jboss.threads.EnhancedQueueExecutor.runThreadBody(EnhancedQueueExecutor.java:1627)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1594)
    at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:11)
    at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:11)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:1583)
```

### To Reproduce

Set up a test with two threads:

1. Thread A - This thread performs the following operations repeatedly:
   - Creates a new **catalog** or **principal**
   - Immediately deletes the catalog or principal after creation
2. Thread B - This thread repeatedly invokes the list operation for catalogs or principals.

Run both threads simultaneously for approximately 10 seconds. During this time, you may observe occasional failures in the listing operations. Specifically, the `GET /catalogs` or `GET /principals` endpoints sometimes fail with a **NullPointerException**. However, the issue is not consistently reproducible and follows no specific pattern.
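The two-thread setup above could be sketched roughly as follows in Java. This is a minimal, hypothetical harness and not the test actually used for this report: the class `ListRaceHarness`, its method names, the bearer-token header, the JSON body for creating a principal, and deleting by name in the path are all assumptions that may need adjusting for a real Polaris deployment.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntSupplier;

// Hypothetical test harness; not part of Polaris.
class ListRaceHarness {

  /**
   * Runs mutator (thread A) and lister (thread B) concurrently for the given
   * duration and returns how many list calls reported HTTP 500.
   */
  static int countServerErrors(Runnable mutator, IntSupplier lister, Duration duration)
      throws Exception {
    AtomicBoolean stop = new AtomicBoolean(false);
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      // Thread A: create-then-delete in a tight loop.
      Future<?> threadA = pool.submit(() -> {
        while (!stop.get()) {
          mutator.run();
        }
      });
      // Thread B: list repeatedly (at least once) and count 500 responses.
      Future<Integer> threadB = pool.submit(() -> {
        int errors = 0;
        do {
          if (lister.getAsInt() == 500) {
            errors++;
          }
        } while (!stop.get());
        return errors;
      });
      Thread.sleep(duration.toMillis());
      stop.set(true);
      threadA.get();
      return threadB.get();
    } finally {
      pool.shutdownNow();
    }
  }

  /** Sends the request and returns its status code, or -1 if the call itself failed. */
  static int status(HttpClient client, HttpRequest request) {
    try {
      return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    } catch (java.io.IOException e) {
      return -1;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return -1;
    }
  }

  /** Wires the harness to the principal endpoints for roughly 10 seconds. */
  static int reproduceAgainst(String baseUrl, String token) throws Exception {
    HttpClient client = HttpClient.newHttpClient();
    Runnable createThenDelete = () -> {
      String name = "race-" + UUID.randomUUID();
      // Assumption: request body shape and deletion by name in the path.
      String body = "{\"principal\": {\"name\": \"" + name + "\"}}";
      status(client, HttpRequest.newBuilder(URI.create(baseUrl + "/principals"))
          .header("Authorization", "Bearer " + token) // assumption: bearer auth
          .header("Content-Type", "application/json")
          .POST(HttpRequest.BodyPublishers.ofString(body)).build());
      status(client, HttpRequest.newBuilder(URI.create(baseUrl + "/principals/" + name))
          .header("Authorization", "Bearer " + token)
          .DELETE().build());
    };
    IntSupplier list = () -> status(client,
        HttpRequest.newBuilder(URI.create(baseUrl + "/principals"))
            .header("Authorization", "Bearer " + token)
            .GET().build());
    return countServerErrors(createThenDelete, list, Duration.ofSeconds(10));
  }
}
```

Calling `ListRaceHarness.reproduceAgainst("http://localhost:8181/api/management/v1", token)` with a valid token would return the number of list calls that came back as 500. Because the failure is intermittent, a result of zero on a single run does not rule the race out.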
These are the endpoints I used for the test, relative to the base URL `http://localhost:8181/api/management/v1`:

- **Create Catalog**: `POST /catalogs`
- **Delete Catalog**: `DELETE /catalogs/{catalogId}`
- **List Catalogs**: `GET /catalogs`
- **Create Principal**: `POST /principals`
- **Delete Principal**: `DELETE /principals/{principalId}`
- **List Principals**: `GET /principals`

### Actual Behavior

The **CREATE** operation sometimes returns a **500** error with a **NullPointerException** in the logs. It is unclear to me why this is happening, but based on the stack trace, it seems to be related to the list operation trying to access a catalog or principal entity that is in the process of being created or deleted by another thread.

### Expected Behavior

I would expect the list operation to return the current state of catalogs or principals, even when concurrent operations are in progress. The list operation should handle concurrent modifications gracefully.

### Additional context

The test environment used the in-memory store.

### System information

- OS: macOS Sonoma v15.4.1
- Polaris Catalog Version: 1.0.0-incubating-SNAPSHOT
- Object Storage: FILE
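One plausible reading of the stack trace, offered as a sketch and not a confirmed diagnosis, is that the list path works in two steps: it first collects identifiers and then loads each entity. An entity deleted between the two steps then loads as `null` and is passed to a wrapping constructor, matching the `"sourceEntity" is null` message. The Java below simulates that interleaving deterministically. `Base`, `Wrapped`, and `ListRaceSketch` are hypothetical stand-ins and not Polaris classes, and skipping missing entries is only one possible way to make listing tolerant of concurrent deletes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical stand-ins; none of these types are Polaris classes.
class ListRaceSketch {

  /** Stand-in for a stored base entity. */
  static final class Base {
    final long id;
    final String name;

    Base(long id, String name) {
      this.id = id;
      this.name = name;
    }
  }

  /** Stand-in for a wrapper whose constructor dereferences its source. */
  static final class Wrapped {
    final String name;

    Wrapped(Base source) {
      this.name = source.name; // throws NullPointerException when source is null
    }
  }

  /** Second step of a two-step listing: ids deleted in the meantime load as null. */
  static List<Base> load(List<Long> ids, Map<Long, Base> store) {
    List<Base> loaded = new ArrayList<>();
    for (Long id : ids) {
      loaded.add(store.get(id));
    }
    return loaded;
  }

  /** Wraps every loaded entry; fails if any entry is null. */
  static List<Wrapped> wrapUnsafe(List<Base> loaded) {
    return loaded.stream().map(Wrapped::new).collect(Collectors.toList());
  }

  /** Wraps only the entries that still exist, dropping vanished ones. */
  static List<Wrapped> wrapSkippingMissing(List<Base> loaded) {
    return loaded.stream()
        .filter(Objects::nonNull)
        .map(Wrapped::new)
        .collect(Collectors.toList());
  }
}
```

To exercise the sketch, snapshot the key set of a map holding two `Base` entries, remove one entry, and pass the snapshot to `load`. `wrapUnsafe` then throws a `NullPointerException`, while `wrapSkippingMissing` returns only the surviving entry.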